State-Retentive Power Gating of Register Files in Multicore Processors Featuring Multithreaded In-Order Cores
In this work, we investigate state-retentive power gating of register files for leakage reduction in multicore processors supporting multithreading. In an in-order core, when a thread gets blocked due to a memory stall, the corresponding register file can be placed in a low leakage state through pow...
Gespeichert in:
Veröffentlicht in: | IEEE transactions on computers 2011-11, Vol.60 (11), p.1547-1560 |
---|---|
Hauptverfasser: | , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | In this work, we investigate state-retentive power gating of register files for leakage reduction in multicore processors supporting multithreading. In an in-order core, when a thread gets blocked due to a memory stall, the corresponding register file can be placed in a low leakage state through power gating for leakage reduction. When the memory stall gets resolved, the register file is activated for being accessed again. Since the contents of the register file are not lost and restored on wakeup, this is referred to as state-retentive power gating of register files. While state-retentive power gating in single cores has been studied in the literature, it is being investigated for multicore architectures for the first time in this work. We propose specific techniques to implement state-retentive power gating for three different multicore processor configurations based on the multithreading model: 1) coarse-grained multithreading, 2) fine-grained multithreading, and 3) simultaneous multithreading. The proposed techniques can be implemented as design extensions within the control units of the in-order cores. Each technique uses two different modes of leakage states: low-leakage savings and low wake-up and high-leakage savings and high wake-up latency. The overhead due to wake-up latency is completely avoided in two techniques while it is hidden for most part in the third approach, either by overlapping the wake-up process with the thread context switching latency or by executing instructions from other threads ready for execution. The proposed techniques were evaluated through simulations with multiprogrammed workloads comprised of SPEC 2000 integer benchmarks. Experimental results show that in an 8-core processor executing 64 threads, the average leakage savings were 42 percent in coarse-grained multithreading, while they were between seven percent and eight percent for finegrained and simultaneous multithreading. |
---|---|
ISSN: | 0018-9340 1557-9956 |
DOI: | 10.1109/TC.2010.249 |