Fast timing-model independent buffered clock-tree synthesis

Bibliographic details
Main authors: Shih, Xin-Wei; Chang, Yao-Wen
Format: Conference proceeding
Language: English
Description
Summary: In high-performance synchronous chip design, a buffered clock tree with small clock skew is essential for improving clocking speed. Because timing models are not accurate enough for modern chip design, embedding simulation into the clock-tree synthesis flow becomes inevitable. Consequently, the runtime of clock-tree synthesis becomes prohibitively long as the complexity of chip designs grows rapidly. To construct a buffered clock tree efficiently, we propose an ultra-fast, timing-model independent approach that minimizes skew through structure optimization. To this end, a novel clock-tree structure, called the symmetrical structure, is presented. At each level of a symmetrical clock tree, the number of branches, the wire length, and the inserted buffers are almost the same. Clock skew is naturally small when the configurations of all paths from the clock source to the sinks are similar, so by constructing the clock tree symmetrically, skew can be minimized without referring to simulation information. Experimental results show that our approach not only constructs a buffered clock tree efficiently, but also minimizes clock skew effectively with marginal wiring overhead. For example, on a set of commonly used IBM benchmarks, a state-of-the-art approach without (with) ngspice simulation yields on average 7.93X (2.77X) the clock skew and requires 46X (24343X) the runtime of our approach.
ISSN:0738-100X
DOI:10.1145/1837274.1837296
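
Illustration: the summary's central claim, that skew is small when every source-to-sink path has a similar configuration, can be sketched in a few lines of Python. This is not the authors' algorithm. The per-level numbers, the function names, and both delay formulas are hypothetical and chosen only for illustration. The sketch also idealizes the tree so that each level is exactly identical, whereas the summary says "almost the same", so a real tree would retain some residual skew.

```python
from itertools import product

# Hypothetical per-level description of an idealized symmetrical tree:
# (branches per node, wire length per branch, buffers per branch)
LEVELS = [(2, 400.0, 1), (3, 150.0, 1), (2, 60.0, 0)]


def sink_paths(levels):
    """Map each sink (identified by its branch choices) to its path configuration."""
    branch_choices = [range(branches) for branches, _, _ in levels]
    # In an exactly symmetrical tree the segment at a level does not depend
    # on which branch is taken, so every sink gets the same configuration.
    segments = tuple((wire, buffers) for _, wire, buffers in levels)
    return {choice: segments for choice in product(*branch_choices)}


def skew(paths, delay):
    """Skew is the largest minus the smallest source-to-sink delay."""
    delays = [delay(path) for path in paths.values()]
    return max(delays) - min(delays)


# Two made-up delay models. Neither is a real timing model; they only
# stand in for "any function of the path configuration".
def linear_delay(path):
    return sum(0.01 * wire + 5.0 * buffers for wire, buffers in path)


def quadratic_delay(path):
    return sum(1e-4 * wire * wire + 7.5 * buffers for wire, buffers in path)


paths = sink_paths(LEVELS)
print(len(paths), "sinks")
print("skew under linear model:   ", skew(paths, linear_delay))
print("skew under quadratic model:", skew(paths, quadratic_delay))
```

Because all paths share one configuration, any delay function that depends only on that configuration gives every sink the same delay. That is the sense in which a structural approach can be independent of the timing model.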