Fault Tolerant Network on Chip Switching With Graceful Performance Degradation

The structural redundancy inherent to on-chip interconnection networks [networks on chip (NoC)] can be exploited by adaptive routing algorithms in order to provide connectivity even if network components are out of service due to faults, which will appear at an increasing rate with future chip techn...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on computer-aided design of integrated circuits and systems 2010-06, Vol.29 (6), p.883-896
Hauptverfasser: Kohler, Adan, Schley, Gert, Radetzki, Martin
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The structural redundancy inherent to on-chip interconnection networks [networks on chip (NoC)] can be exploited by adaptive routing algorithms in order to provide connectivity even if network components are out of service due to faults, which will appear at an increasing rate with future chip technology nodes. This paper is based on a new, fine-grained functional fault model and a corresponding distributed fault diagnosis method that facilitate determining the fault status of individual NoC switches and their adjacent communication links. Whereas previous work on network fault-tolerance assume switches to be either available or fully out of service, we present a novel adaptive routing algorithm that employs the remaining functionality of partly defective switches. Using diagnostic information, transient faults are handled with a retransmission scheme that avoids the latency penalty of end-to-end repeat requests. Thereby, graceful degradation of NoC communication performance can be achieved even under high failure rates.
ISSN:0278-0070
1937-4151
DOI:10.1109/TCAD.2010.2048399