Compiler-based cache line optimization

Generally, a microprocessor operates much faster than main memory can supply data to the microprocessor. Therefore, many computer systems temporarily store recently and frequently used data in smaller, but much faster cache memory. Cache memory may reside directly on the microprocessor chip (Level 1...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
1. Verfasser:	Kosche, Nicolai
Format:	Patent
Sprache:	eng
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Generally, a microprocessor operates much faster than main memory can supply data to the microprocessor. Therefore, many computer systems temporarily store recently and frequently used data in smaller, but much faster cache memory. Cache memory may reside directly on the microprocessor chip (Level 1 cache) or may be external to the microprocessor (Level 2 cache). In the past, on-chip cache memory was relatively small, 8 or 16 kilobytes (KB); however, more recent microprocessor designs have on-chip cache memories of 256 and even 512 KB. Cache line optimization involves computing where cache misses are in a control flow and assigning probabilities to cache misses. Cache lines may be scheduled based on the assigned probabilities and where the cache misses are in the control flow. Cache line probabilities may be calculated based on the relationship of the cache line and where the cache misses are in the control flow. A control flow may be pruned before calculating cache line probabilities. Function call sites may be used to prune the control flow. Address generation of a cache miss may be duplicated to speculatively hoist address generation and the associated prefetch. References may be selected for optimization, identifying cache lines, and mapping the selected references. Dependencies within the cache lines may be determined and the cache lines may be scheduled based on the determined dependencies and probabilities of usefulness. Instructions may be scheduled based on the scheduled cache lines and the target machine model to maximize outstanding memory transactions. Cache lines may be scheduled across call sites.