Design for fault-tolerance in System ES/9000 Model 900
Main authors: | , , , |
---|---|
Format: | Conference proceedings |
Language: | English |
Subjects: | |
Summary: | The authors present the design for fault tolerance in the IBM ES/9000 Model 900 high-end commercial processor. The design combines circuit-level concurrent error detection, fault identification, and reconfiguration with system-level techniques when multiple functional resources are available. It provides true graceful degradation during central processor or channel reconfiguration and repair. The authors discuss the design point for this processor and the trade-offs involved; show the error detection and online repair process of a central processor, with the work recovered on an alternate central processor transparently to the application; describe dynamic path selection and the hot-pluggable channels; and illustrate the fault-tolerance techniques used in the level 1 cache and the central store. |
---|---|
DOI: | 10.1109/FTCS.1992.243617 |