Optimal tracing and replay for debugging message-passing parallel programs

Bibliographic details
Main authors: Netzer, R.H.B., Miller, B.P.
Format: Conference proceeding
Language: English
Description
Summary: A technique for tracing and replaying message-passing programs for debugging is presented. The technique is optimal in the common case and has good performance in the worst case. By making runtime tracing decisions, only a fraction of the total number of messages is traced, a reduction of two orders of magnitude over traditional techniques, which trace every message. Experiments indicate that often only 1% of the messages need to be traced. These traces are sufficient to provide replay, allowing an execution to be reproduced any number of times for debugging. This work is novel in that runtime decisions are used to detect and trace only those messages that introduce nondeterminacy. With the proposed strategy, large reductions in trace size allow long-running programs to be replayed that were previously unmanageable. In addition, the reduced tracing requirements alleviate tracing bottlenecks, allowing executions to be debugged with substantially lower execution-time overhead.
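The summary does not say how a message is recognized at runtime as introducing nondeterminacy. One plausible reading, sketched below as an illustration only, is that a message needs tracing when it races with the previous message delivered to the same process, meaning the previous receive did not happen before the new message's send. The sketch uses vector clocks for that check; the `Message` and `Process` classes, the `demo` function, and the rule that the first receive is never traced are assumptions made for this example, not details taken from the record.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: int
    payload: str
    send_clock: tuple  # sender's vector clock at the send event


@dataclass
class Process:
    pid: int
    n: int                      # total number of processes
    clock: list = field(default_factory=list)
    last_recv_tick: int = 0     # own clock component at the previous receive (0 = none yet)
    recv_count: int = 0
    trace: list = field(default_factory=list)

    def __post_init__(self):
        self.clock = [0] * self.n

    def send(self, payload):
        # A send is a local event: advance our own component, then stamp the message.
        self.clock[self.pid] += 1
        return Message(self.pid, payload, tuple(self.clock))

    def receive(self, msg):
        self.recv_count += 1
        # The previous receive happened before this send only if the sender
        # had already seen it, i.e. its view of our component is >= that tick.
        races = (self.last_recv_tick > 0
                 and msg.send_clock[self.pid] < self.last_recv_tick)
        if races:
            # Hypothetical trace record: (receive index, sender) is what a
            # replay would need to force the same delivery order.
            self.trace.append((self.recv_count, msg.sender))
        # Standard vector-clock merge, then count the receive as a local event.
        self.clock = [max(a, b) for a, b in zip(self.clock, msg.send_clock)]
        self.clock[self.pid] += 1
        self.last_recv_tick = self.clock[self.pid]
        return races


def demo():
    p0, p1, p2 = (Process(i, 3) for i in range(3))
    a = p1.send("a")            # sent independently of b
    b = p2.send("b")
    p0.receive(a)               # first receive: nothing to race with
    p0.receive(b)               # races with a: could have arrived first
    p1.receive(p0.send("reply"))
    c = p1.send("c")            # causally after p0's earlier receives
    p0.receive(c)               # ordered by causality: no trace needed
    return p0.trace
```

In `demo`, four messages are delivered but only the second receive at process 0 is recorded, because it is the only delivery whose order is not already fixed by causality. This mirrors the claim in the summary that only a small fraction of messages need to be traced, although the percentages reported there come from the authors' experiments, not from this sketch.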
DOI: 10.1109/SUPERC.1992.236654