HyBar: high efficient barrier synchronization based on a hybrid packet-circuit switching Network-on-Chip
Published in: | Science China. Information sciences 2017-06, Vol.60 (6), p.233-244, Article 062402 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Full text |
Abstract: | Realizing barrier synchronization in multi-/many-core processors with high efficiency becomes more and more challenging as the number of cores integrated in a single chip keeps growing. Quite a few barrier solutions have been proposed, but they provide limited improvements for synchronizing large numbers of cores or impose unfavorable restrictions on performing concurrent barriers. This paper presents HyBar, a hardware barrier based on a hybrid switching NoC which adopts packet switching and circuit switching in two sub-networks respectively. Dedicated channels in the circuit-switching sub-network are dynamically built and removed as barrier requests traverse the packet-switching sub-network according to a modified dimension-order routing algorithm. The efficiency of inter-core communication for concurrent barriers is improved by merging barrier arrival requests and broadcasting release requests along the circuit channels. The execution times of synthetic cases, benchmark kernels and parallel applications using various barrier solutions are evaluated on an RTL-based simulation platform. Experimental results show that our proposal provides about 15%–50% performance improvement compared to previous solutions, while the hardware overhead is marginal under SMIC 40 nm technology. Moreover, HyBar introduces only a minor efficiency loss for concurrent barriers, with no limitation on the layout of their participating cores in the on-chip network. |
---|---|
ISSN: | 1674-733X 1869-1919 |
DOI: | 10.1007/s11432-016-0306-y |