CUTBUF: Buffer Management and Router Design for Traffic Mixing in VNET-Based NoCs

Router's buffer design and management strongly influence energy, area and performance of on-chip networks, hence it is crucial to encompass all of these aspects in the design process. At the same time, the NoC design cannot disregard preventing network-level and protocol-level deadlocks by devo...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on parallel and distributed systems 2016-06, Vol.27 (6), p.1603-1616
Hauptverfasser: Zoni, Davide, Flich, Jose, Fornaciari, William
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Router's buffer design and management strongly influence energy, area and performance of on-chip networks, hence it is crucial to encompass all of these aspects in the design process. At the same time, the NoC design cannot disregard preventing network-level and protocol-level deadlocks by devoting ad-hoc buffer resources to that purpose. In chip multiprocessor systems the coherence protocol usually requires different virtual networks (VNETs) to avoid deadlocks. Moreover, VNET utilization is highly unbalanced and there is no way to share buffers between them due to the need to isolate different traffic types. This paper proposes CUTBUF, a novel NoC router architecture to dynamically assign virtual channels (VCs) to VNETs depending on the actual VNETs load to significantly reduce the number of physical buffers in routers, thus saving area and power without decreasing NoC performance. Moreover, CUTBUF allows to reuse the same buffer for different traffic types while ensuring that the optimized NoC is deadlock-free both at network and protocol level. In this perspective, all the VCs are considered spare queues not statically assigned to a specific VNET and the coherence protocol only imposes a minimum number of queues to be implemented. Synthetic applications as well as real benchmarks have been used to validate CUTBUF, considering architectures ranging from 16 up to 48 cores. Moreover, a complete RTL router has been designed to explore area and power overheads. Results highlight how CUTBUF can reduce router buffers up to 33 percent with 2 percent of performance degradation, a 5 percent of operating frequency decrease and area and power saving up to 30.6 and 30.7 percent, respectively. Conversely, the flexibility of the proposed architecture improves by 23.8 percent the performance of the baseline NoC router when the same number of buffers is used.
ISSN:1045-9219
1558-2183
DOI:10.1109/TPDS.2015.2468716