Probabilistic system-level fault diagnostic algorithms for multiprocessors
Published in: | Parallel Computing 1997-01, Vol.22 (13), p.1807-1821 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Massively parallel computers (MPCs) introduce new requirements for system-level fault diagnosis, such as handling a huge number of processing elements in a heterogeneous system. They also have specific attributes, such as regular topology and low local complexity. Traditional deterministic methods of system-level diagnosis did not consider these issues. This paper presents a new approach, called local information diagnosis, that exploits the characteristics of massively parallel systems. The paper defines the diagnostic model, which is based on generalized test invalidation to handle inhomogeneity in multiprocessors. Five effective probabilistic diagnostic algorithms using the proposed method are also given, and their space and time complexity are estimated. |
---|---|
ISSN: | 0167-8191, 1872-7336 |
DOI: | 10.1016/S0167-8191(96)00078-6 |