Fault-tolerant execution of large parameter sweep applications across multiple VOs with storage constraints

Bibliographic details
Published in: Concurrency and computation 2009-03, Vol.21 (3), p.377-392
Main authors: Ayyub, Shahaan, Abramson, David, Enticott, Colin, Garic, Slavisa, Tan, Jefferson
Format: Article
Language: English
Online access: Full text
Description
Summary: Applications that span multiple virtual organizations (VOs) are of great interest to the e‐science community. However, our recent attempts to execute large‐scale parameter sweep applications (PSAs) for real‐world climate studies with the Nimrod/G tool have exposed problems in the areas of fault tolerance, data storage and trust management. In response, we have implemented a task‐splitting approach that facilitates breaking up large PSAs into a sequence of dependent subtasks, improving fault tolerance; provides a garbage collection technique that deletes unnecessary data; and employs a trust delegation technique that facilitates flexible third party data transfers across different VOs. Copyright © 2008 John Wiley & Sons, Ltd.
ISSN: 1532-0626, 1532-0634
DOI: 10.1002/cpe.1353