Bufferless Network-on-Chips With Bridged Multiple Subnetworks for Deflection Reduction and Energy Savings

Bibliographic details
Published in: IEEE Transactions on Computers, 2020-04, Vol. 69 (4), pp. 577-590
Main authors: Xiang, Xiyue; Sigdel, Purushottam; Tzeng, Nian-Feng
Format: Article
Language: English
Description
Summary: A bufferless network-on-chip (NoC) can deliver high energy efficiency, but such a NoC is subject to growing deflection when its traffic load rises. This article proposes Deflection Containment (DeC) for the bufferless NoC to address its notorious shortcoming of excessive deflection, for performance improvement and energy savings. With multiple subnetworks bridged by an added link between two corresponding routers, DeC lets a contending flit in one subnetwork be forwarded to another subnetwork instead of being deflected. The microarchitecture of DeC routers is rectified to shorten the critical path and lift network bandwidth. Its Cadence RTL implementations with a 15-nm process are conducted for mesh-based NoCs and torus-based NoCs, respectively. Additionally, differently sized DeC-NoCs are evaluated extensively and compared with previous bufferless designs (BLESS and MinBD), uncovering that DeC with two bridged subnetworks (dubbed DeC2) for 8X8 mesh-based NoCs can lower deflection drastically, by some 90 percent, and energy consumption by up to 51 percent under real benchmark traffic loads, in comparison to BLESS. Under various synthetic traffic models and workloads, the 16X16 torus-based DeC2-NoC sustains up to 2.33X the load of its mesh-based counterpart, exhibiting the same clock rate and taking only negligibly more power and area according to our full layout results.
ISSN: 0018-9340, 1557-9956
DOI: 10.1109/TC.2019.2959307