An Analysis of Loop Latency in Dataflow Execution

Bibliographic details
Main authors: Najjar, W.A., Miller, W.M., Bohm, A.P.W.
Format: Conference paper
Language: English
Description
Summary: Recent evidence indicates that the exploitation of locality in dataflow programs could have a dramatic impact on performance. The current trend in the design of dataflow processors suggests a synthesis of traditional non-strict fine grain instruction execution and strict coarse grain execution in order to exploit locality. While an increase in instruction granularity will favor the exploitation of locality within a single execution thread, the resulting grain size may increase latency among execution threads. In this paper, the latency incurred through the partitioning of fine grain instructions into coarser grain threads is evaluated. We define the concept of a cluster of fine grain instructions to quantify coarse grain input and output latencies using a set of numeric benchmarks. The results offer compelling evidence that the inner loops of a significant number of numeric codes would benefit from coarse grain execution. Based on cluster execution times, more than 60% of the measured benchmarks favor a coarse grain execution. In 63% of the cases, the input latency to the cluster is the same in coarse and fine grain execution modes. These results suggest that the effect of increased instruction granularity on latency is minimal for a high percentage of the measured codes and is in large part offset by available intra-thread locality. Furthermore, simulation results indicate that strict or non-strict data structure access does not change the basic cluster characteristics.
DOI:10.1109/ISCA.1992.753331