Analytical Modeling the Multi-Core Shared Cache Behavior with Considerations of Data-Sharing and Coherence

Bibliographic Details
Main Authors:	Ling, Ming; Lu, Xiaoqian; Wang, Guangmin; Ge, Jiancong
Format:	Article
Language:	English
Subjects:	Computer Science - Hardware Architecture

Description
Summary:	To mitigate the ever-worsening "Power wall" and "Memory wall" problems, multi-core architectures with multilevel cache hierarchies have been widely adopted in modern processors. However, the complexity of these architectures makes modeling shared caches extremely difficult. In this paper, we propose a data-sharing-aware analytical model for estimating the miss rates of the downstream shared cache under multi-core scenarios. Moreover, the proposed model can be integrated with upstream cache analytical models that take multi-core private cache coherence effects into account. This integration avoids the time-consuming full simulations of the cache architecture that are required by conventional approaches. We validate our analytical model against gem5 simulation results using 13 applications from the PARSEC 2.1 benchmark suite. Compared to the results from gem5 simulations under 8 hardware configurations, including dual-core and quad-core architectures, the average absolute error of the predicted shared L2 cache miss rates is less than 2% for all configurations. After integration with the refined upstream model with coherence misses, the overall average absolute error in 4 hardware configurations rises to 8.03% due to error accumulation. The proposed coherence model achieves accuracy similar to that of the state-of-the-art approach with only one tenth of the time overhead. As an application case of the integrated model, we also evaluate the miss rates of 57 different multi-core and multi-level cache configurations.
DOI:	10.48550/arxiv.2007.11195