System recovery from errors for processor and associated components

Bibliographic details
Main authors: KLECKA JAMES S, BUNTON WILLIAM P, KONDO THOMAS J, JARDINE ROBERT L, STOTT GRAHAM B
Format: Patent
Language: English
Description
Summary: A computer system includes a primary processor and a secondary processor running in lockstep. The lockstep may or may not be synchronous. Errors occurring in the primary processor or the secondary processor are reported to an error-handling module. If the error is a recoverable error, the state of one of the processors is saved and the processors are restarted using the saved state. In addition to the reporting of errors from the processors, cross checking of the operation of the processors is performed to detect a divergence in the operation of the processors. If the divergence is reported to be due to a recoverable error, the state of one of the processors is saved and the processors are restarted using the saved state. Procedures are also disclosed to ensure that data corruption does not propagate onto an associated network, and to ensure that the system is not lost as a network resource during processor restart.
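
The recovery flow in the summary can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the patented mechanism: the names `Processor`, `LockstepPair`, `Severity`, `step`, and `handle_error` are hypothetical, processor state is reduced to a single integer, the caller decides which processor's state is trusted, and halting on an unrecoverable error is an assumption the summary does not spell out. Withholding the output on a mismatch loosely models keeping corrupted data off the network.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Severity(Enum):
    RECOVERABLE = auto()
    UNRECOVERABLE = auto()


@dataclass
class Processor:
    name: str
    state: int = 0  # stand-in for registers and memory (assumption)

    def step(self, value: int) -> int:
        # Toy "instruction": accumulate the input into the state.
        self.state += value
        return self.state


class LockstepPair:
    """Hypothetical primary/secondary pair with a minimal error handler."""

    def __init__(self) -> None:
        self.primary = Processor("primary")
        self.secondary = Processor("secondary")
        self.restarts = 0
        self.halted = False

    def step(self, value: int):
        """Run one step on both processors and cross-check the results.

        Returns the agreed output, or None when the processors diverge,
        so that a possibly corrupted value is never released.
        """
        if self.halted:
            raise RuntimeError("pair is halted")
        a = self.primary.step(value)
        b = self.secondary.step(value)
        return a if a == b else None

    def handle_error(self, severity: Severity, trusted: Processor) -> None:
        """Save the trusted processor's state and restart both from it."""
        if severity is Severity.UNRECOVERABLE:
            self.halted = True  # assumption: no recovery is attempted
            return
        saved = trusted.state  # save the state of one processor
        self.primary.state = saved  # restart both from the saved state
        self.secondary.state = saved
        self.restarts += 1


pair = LockstepPair()
pair.step(5)                    # both processors agree
pair.secondary.state ^= 1       # inject a fault into the secondary
if pair.step(1) is None:        # cross-check detects the divergence
    pair.handle_error(Severity.RECOVERABLE, trusted=pair.primary)
```

After the recoverable divergence is handled, both processors hold the same state again and later calls to `step` return matching outputs. A real system would also need the error report itself to identify which processor to trust, which this sketch leaves to the caller.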