Scalable Hardware Memory Disambiguation for High ILP Processors
Main authors: | , , , , |
---|---|
Format: | Conference proceeding |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | This paper describes several methods for improving the scalability of memory disambiguation hardware for future high ILP processors. As the number of in-flight instructions grows with issue width and pipeline depth, the load/store queues (LSQ) threaten to become a bottleneck in both power and latency. By employing lightweight approximate hashing in hardware with structures called Bloom filters, many improvements to the LSQ are possible. We propose two types of filtering schemes using Bloom filters: search filtering, which uses hashing to reduce both the number of lookups to the LSQ and the number of entries that must be searched, and state filtering, in which the number of entries kept in the LSQs is reduced by coupling address predictors and Bloom filters, permitting smaller queues. We evaluate these techniques for LSQs indexed by both instruction age and the instruction's effective address, and for both centralized and physically partitioned LSQs. We show that search filtering avoids up to 98% of the associative LSQ searches, providing significant power savings and keeping LSQ searches to under one high-frequency clock cycle. We also show that with state filtering, the load queue can be eliminated altogether with only minor reductions in performance for small instruction window machines. |
---|---|
DOI: | 10.5555/956417.956553 |
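
The search filtering idea in the summary can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the paper's hardware design: it uses a single-hash counting Bloom filter, and the class names, the 64-counter size, and the hash function are all hypothetical choices made for illustration. A load consults the filter first and searches the store queue only when the filter reports a possible match.

```python
class CountingBloomFilter:
    """Single-hash counting filter over addresses (size and hash are assumptions)."""

    def __init__(self, num_counters=64):
        self.counters = [0] * num_counters

    def _index(self, address):
        # Assumed hash: drop the low 3 bits (8-byte words), then take the modulus.
        return (address >> 3) % len(self.counters)

    def insert(self, address):
        self.counters[self._index(address)] += 1

    def remove(self, address):
        self.counters[self._index(address)] -= 1

    def may_contain(self, address):
        # False means "definitely absent"; True may be a false positive.
        return self.counters[self._index(address)] > 0


class FilteredStoreQueue:
    """Hypothetical store queue whose searches are guarded by the filter."""

    def __init__(self):
        self.filter = CountingBloomFilter()
        self.stores = []    # in-flight store addresses, oldest first
        self.searches = 0   # full queue searches actually performed
        self.filtered = 0   # searches avoided by the filter

    def issue_store(self, address):
        self.stores.append(address)
        self.filter.insert(address)

    def commit_oldest_store(self):
        address = self.stores.pop(0)
        self.filter.remove(address)

    def load_matches_store(self, address):
        if not self.filter.may_contain(address):
            self.filtered += 1      # no in-flight store can match: skip the search
            return False
        self.searches += 1          # possible match: fall back to the full search
        return address in self.stores


queue = FilteredStoreQueue()
queue.issue_store(0x1000)
queue.issue_store(0x2008)
print(queue.load_matches_store(0x1000))  # real match, queue is searched
print(queue.load_matches_store(0x3010))  # filter rules it out, no search
print(queue.load_matches_store(0x4000))  # false positive, searched but no match
print(queue.searches, queue.filtered)
```

In this model a false positive costs only an unnecessary search, while a negative answer from the filter is always safe, which is the property that lets most searches be skipped.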