Fault handler for a multiple computer system

Bibliographic details
Main authors: FREEDMAN, MORRIS D; TASAR, OMUR; WHITESIDE, ARLISS E
Format: Patent
Language: English
Description
Summary: A fault handler is disclosed for each computer in a multiple computer system; it excludes faulty computers from participating in the operation of the system. The fault handler comprises one or more message checkers (216, 218, 220, 222 and 224) and a fault tolerator (228). In the disclosed embodiment, the fault handler further includes a synchronizer (226) for synchronizing the operation of the associated computer with the operation of the other computers in the system. The message checkers check each message received from the other computers and forward the messages to the fault tolerator. From error messages received from the other computers and errors detected by its own message checkers, the fault tolerator (228) decides which computers are faulty and discards the messages received from those computers. Only messages received from non-faulty computers are passed on for further processing. When an error is detected in a message, the fault tolerator sends a message to all of the other computers identifying the computer that sent the erroneous message.
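
The message flow in the summary can be illustrated with a minimal Python sketch. It is not the patented design: the summary does not say how the fault tolerator decides that a computer is faulty, so the rule used here (a computer is excluded once a fixed number of distinct computers have reported it) is an assumption, as are the names `Message`, `FaultHandler`, `threshold`, and `broadcast`. The synchronizer (226) is left out.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Message:
    sender: int    # id of the computer that sent the message
    payload: dict  # message contents (format is hypothetical)


@dataclass
class FaultHandler:
    own_id: int
    # Message checkers: each returns True if the message passes its check.
    checkers: list[Callable[[Message], bool]]
    # Hypothetical transport hook: broadcast(reporter_id, accused_id).
    broadcast: Callable[[int, int], None]
    # Assumed decision rule: distinct accusers needed to exclude a computer.
    threshold: int = 2
    accusers: dict[int, set[int]] = field(default_factory=lambda: defaultdict(set))

    def is_faulty(self, computer: int) -> bool:
        return len(self.accusers[computer]) >= self.threshold

    def on_error_report(self, reporter: int, accused: int) -> None:
        """Record an error message received from another computer."""
        # Assumption: reports from already excluded computers are ignored.
        if not self.is_faulty(reporter):
            self.accusers[accused].add(reporter)

    def on_message(self, msg: Message) -> Optional[Message]:
        """Return the message only if it may be passed on for processing."""
        if not all(check(msg) for check in self.checkers):
            # Locally detected error: record it and tell all other computers
            # which computer sent the bad message.
            self.accusers[msg.sender].add(self.own_id)
            self.broadcast(self.own_id, msg.sender)
            return None  # assumption: the erroneous message itself is dropped
        if self.is_faulty(msg.sender):
            return None  # discard everything from computers judged faulty
        return msg


if __name__ == "__main__":
    reports = []
    handler = FaultHandler(
        own_id=0,
        checkers=[lambda m: "value" in m.payload],  # toy format check
        broadcast=lambda reporter, accused: reports.append((reporter, accused)),
    )
    handler.on_message(Message(sender=2, payload={}))   # fails the check
    handler.on_error_report(reporter=1, accused=2)      # second accuser
    print(handler.is_faulty(2), reports)
```

Counting distinct accusers rather than raw error reports keeps a single misbehaving computer from excluding a healthy one by repeating its accusation; a real system would choose the threshold relative to the number of computers.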