An Analytical Framework for Estimating Scale-Out and Scale-Up Power Efficiency of Heterogeneous Manycores
Published in: | IEEE Transactions on Computers 2016-02, Vol.65 (2), p.367-381 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Heterogeneous manycore architectures have been shown to be highly promising for boosting power efficiency in two independent ways: (1) enabling massive thread-level parallelism, called the "scale-out" approach, and (2) enabling thread migration between heterogeneous cores, called the "scale-up" approach. Accurately modeling the power-efficiency gains of the two approaches, particularly in an analytical and computationally efficient manner, is essential to reaping the power efficiency of such architectures. We propose a comprehensive analytical model to predict the power efficiency obtained from the two approaches. Given that power efficiency is measured as performance per watt, the model is composed of a performance model and a power model. The performance model is built from two orthogonal functions, α and β. Function α describes the scale-out speedup from multithreading; function β represents the scale-up speedup from core heterogeneity. Thus, the performance model can clearly capture the overall speedup of any multithreading and thread-to-core mapping strategy. The power model predicts the power of the corresponding scale-out and scale-up configurations. It simultaneously captures the power variations caused by thread synchronization and by thread migration between heterogeneous cores. We build both the performance and power models analytically, keeping computational complexity in mind. This leads to a suite of comprehensive, low-complexity models suitable for runtime management. The models are validated on a large-scale heterogeneous manycore architecture with full-system simulations. For performance prediction, the average error is below 12 percent, lower than that of the state-of-the-art methods. For power prediction, the average error is 7.74 percent. On top of the models, we introduce two heuristic scheduling algorithms, the performance-oriented MAX-P and the power-efficiency-oriented MAX-E, to demonstrate the usage of these models. The results show that MAX-P outperforms the state-of-the-art methods by 18 percent in performance on average, and MAX-E outperforms the baseline by 70 percent in power efficiency on average. |
---|---|
ISSN: | 0018-9340 1557-9956 |
DOI: | 10.1109/TC.2015.2419655 |