Optimal Generalized H-Tree Topology and Buffering for High-Performance and Low-Power Clock Distribution
Published in: | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2020-02, Vol. 39 (2), p. 478-491 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Clock power, skew, and maximum latency are three key metrics for clock distribution in low-power and high-performance designs. An H-tree offers minimum clock skew and good robustness against variations, but at the cost of large wirelength and clock power. On the other hand, a "fishbone" clock network with spine-and-rib structures has smaller wirelength, latency, and clock power than an H-tree, but larger skew. No previous work enables systematic exploration of the regime between H-tree and spine to achieve an optimal tradeoff among clock power, skew, and latency. In this paper, we study the concept of a generalized H-tree (GH-tree), a topologically balanced tree with an arbitrary sequence of branching factors, and propose a dynamic programming-based method to determine optimal clock power, skew, and latency in the space of GH-tree solutions. Our method co-optimizes clock tree topology and buffering along branches according to fitted electrical models. We further propose a balanced K-means clustering and a linear programming (LP)-guided buffer placement approach to embed the GH-tree with respect to a given sink placement. We validate our solutions in commercial clock tree synthesis (CTS) tool flows, in a commercial foundry's 28LP technology. The results show up to 30% clock power reduction, with skew and maximum latency similar to those of CTS solutions from recent versions of leading commercial place-and-route tools. Our proposed approach also achieves up to 56% clock power reduction, with similar skew and maximum latency, compared to CTS solutions from a state-of-the-art academic tool. |
---|---|
ISSN: | 0278-0070 1937-4151 |
DOI: | 10.1109/TCAD.2018.2889756 |
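
The summary describes a GH-tree as a topologically balanced tree with an arbitrary sequence of branching factors. As a rough illustration of what that search space looks like, the Python sketch below enumerates every ordered sequence of branching factors whose product equals a target number of leaf clusters. It is an assumption-laden toy, not the paper's method: the function name `branching_sequences`, the `min_factor` parameter, and the exact-product requirement are all hypothetical, and the dynamic programming, buffering, and fitted electrical models from the paper are not reproduced here.

```python
from math import prod


def branching_sequences(num_leaves, min_factor=2):
    """Yield each ordered tuple of integer branching factors (top level first)
    whose product equals num_leaves.

    Hypothetical helper for illustration only; requiring an exact product is
    an assumption made to keep the example small.
    """
    if num_leaves == 1:
        # No further branching is needed: an empty sequence completes the tree.
        yield ()
        return
    for factor in range(min_factor, num_leaves + 1):
        if num_leaves % factor == 0:
            # Fix this level's branching factor, then enumerate the levels below.
            for rest in branching_sequences(num_leaves // factor, min_factor):
                yield (factor,) + rest


# Candidate balanced topologies for 16 leaf clusters.
sequences = list(branching_sequences(16))
for seq in sequences:
    print(seq, "levels:", len(seq), "leaves:", prod(seq))
```

In this toy space, the all-2 sequence `(2, 2, 2, 2)` is a balanced tree with binary branching at every level, while `(16,)` is a single level with fanout 16. A real exploration would score each candidate for power, skew, and latency rather than merely list it.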