On-the-fly replay: a practical paradigm and its implementation for distributed debugging

Bibliographic details
Main authors:	Gerstel, O., Zaks, S., Hurfin, M., Plouzeau, N., Raynal, M.
Format:	Conference proceeding
Language:	English
Subjects:	Computational modeling; Computer science; Debugging; Delay; Distributed computing; Hardware; Hypercubes; Monitoring; Parallel machines; Probes

Description
Summary:	This paper presents a practical paradigm called on-the-fly replay. The paradigm consists of running a distributed program twice at the same time: an original computation runs normally, including the steps at which it makes non-deterministic choices, and drives a twin execution whose non-deterministic choices need not be evaluated, since they are taken from the original computation. The paradigm has several uses, of which distributed debugging is particularly noteworthy. The paper describes its integration into a distributed debugging facility called EREBUS. The implementation was run on a distributed-memory parallel machine (Intel Hypercube iPSC2), and the experimental results described demonstrate the advantage of the paradigm.
DOI:	10.1109/SPDP.1994.346158
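
Illustration:	The summary's core idea, an original execution that evaluates non-deterministic choices and a twin that reuses them while both run concurrently, can be sketched in a few lines. This is a toy under stated assumptions, not the EREBUS implementation: two threads stand in for the distributed executions, a queue stands in for the channel between them, and every name (`computation`, `run_on_the_fly_replay`, `original_choice`, `twin_choice`) is hypothetical.

```python
import queue
import random
import threading


def computation(choose, steps):
    """Toy computation whose result depends only on a sequence of choices."""
    trace = []
    for _ in range(steps):
        trace.append(choose())
    return trace


def run_on_the_fly_replay(steps=5):
    """Run an original and a twin execution concurrently (illustrative only)."""
    rng = random.Random()
    choices = queue.Queue()  # assumed channel from the original to the twin
    results = {}

    def original_choice():
        value = rng.randint(0, 9)  # the original evaluates the choice itself
        choices.put(value)         # and forwards the outcome to the twin
        return value

    def twin_choice():
        # The twin never evaluates a choice; it waits for the original's outcome.
        return choices.get()

    def run(name, choose):
        results[name] = computation(choose, steps)

    threads = [
        threading.Thread(target=run, args=("original", original_choice)),
        threading.Thread(target=run, args=("twin", twin_choice)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results["original"], results["twin"]
```

Calling `run_on_the_fly_replay(steps=8)` returns two traces that should be identical, since the twin follows the original's choices step by step. In a debugging setting, the twin is the natural place to attach probes or breakpoints, because slowing it down does not change the choices the original makes.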