Communication-Aware Globally-Coordinated On-Chip Networks

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2012-02, Vol. 23 (2), p. 242-254
Main authors: Yuho Jin, Eun Jung Kim, T. M. Pinkston
Format: Article
Language: English
Description
Summary: With continued Moore's law scaling, multicore-based architectures are becoming the de facto design paradigm for achieving low-cost and performance/power-efficient processing systems through effective exploitation of available parallelism in software and hardware. A crucial subsystem within multicores is the on-chip interconnection network that orchestrates high-bandwidth, low-latency, and low-power communication of data. Much previous work has focused on improving the design of on-chip networks but without more fully taking into consideration the on-chip communication behavior of application workloads that can be exploited by the network design. A significant portion of this paper analyzes and models on-chip network traffic characteristics of representative application workloads. Leveraged by this, the notion of globally coordinated on-chip networks is proposed, in which application communication behavior, captured by traffic profiling, is utilized in the design and configuration of on-chip networks so as to support prevailing traffic flows well, in a globally coordinated manner. This is applied to the design of a hybrid network consisting of a mesh augmented with configurable multidrop (bus-like) spanning channels that serve as express paths for traffic flows benefiting from them, according to the characterized traffic profile. Evaluations reveal that network latency and energy consumption for a 64-core system running OpenMP benchmarks can be improved on average by 15 and 27 percent, respectively, with globally coordinated on-chip networks.
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2011.164