Hierarchical representation of on-chip context to reduce reconfiguration time and implementation area for coarse-grained reconfigurable architecture

In reconfigurable system, fast reconfiguration and small size of configuration contexts are strongly required to enhance the processing performance and reduce the implementation overhead. In this paper, a hierarchical representation of contexts for CGRA called HCC is proposed to satisfy the above re...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Science China. Information sciences 2013-11, Vol.56 (11), p.275-294
Hauptverfasser: Wang, YanSheng, Liu, LeiBo, Yin, ShouYi, Zhu, Min, Cao, Peng, Yang, Jun, Wei, ShaoJun
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In reconfigurable system, fast reconfiguration and small size of configuration contexts are strongly required to enhance the processing performance and reduce the implementation overhead. In this paper, a hierarchical representation of contexts for CGRA called HCC is proposed to satisfy the above requirements. In HCC, the contexts are constructed in a hierarchical fashion to thoroughly eliminate the repetitive portions of the contexts, not only reducing the overall contexts storage size, but also alleviating the contexts transportation overhead. The fast context-indexing mechanism is proposed in HCC to achieve high configuration speed, since the hierarchically organized contexts can be located and accessed conveniently. HCC has been verified in a reconfigurable processor called REMUS_HP. Owing to HCC, when implementing H.264 decoding on REMUS_HP, 76.67% of the overall contexts are reduced compared with the traditional non-hierarchical one; and the configuration speed is averagely 23× increased compared with the latest reported optimized configuration mechanism on Virtex-4 FX60. REMUS_HP is implemented on a 48.9 mm^2 silicon with TSMC 65 nm technology. Simulation shows that 1920 ×1088@30 fps could be achieved for H.264 high-profile decoding when exploiting a 200 MHz working frequency. Compared with the high performance version of XPP, the performance is 181% boosted.
ISSN:1674-733X
1869-1919
DOI:10.1007/s11432-013-4842-5