Characterizing the Effectiveness of Hot Sparing on Cost and Performance-per-Watt in Application Specific SIMT

Bibliographic Details
Published in: Integration (Amsterdam), 2019-11, Vol. 69, p. 198-209
Main authors: Mozafari, S.H., Meyer, B.H.
Format: Article
Language: English
Description
Summary: Adding redundant components is a well-known industry technique for replacing defective components, which improves yield and consequently reduces manufacturing cost. Previously, most yield improvement strategies used redundant components only when another component had failed (i.e., cold spares). However, hot spares are becoming popular in commercial products (e.g., the NVIDIA Ti GPU series). Hot spares reduce manufacturing cost when components are defective; otherwise, they can be used to improve performance in the field. In this paper, we investigate whether the performance improvement from hot spares can be used to improve performance per watt (PPW) in multi-core single-instruction, multiple-thread (SIMT) processors across different applications. We also investigate the cost and PPW implications of employing different types of hot spares in SIMT processors. We then study optimal solutions in the cost-PPW design space to see which kind of redundancy improves cost and PPW the most. Because evaluating individual design points (different SIMT processor configurations with redundancy) is time consuming, we adapt a design space exploration algorithm to find near-optimal solutions without evaluating the design space exhaustively; it finds approximate optimal solutions three times better than conventional methods. We observe that hot sparing is effective for specific types of SIMT processor configurations (small and medium sized). On these configurations, it can improve PPW by more than 16%, on average, for applications that experience significant performance improvement from added hot spares (e.g., FFT and FILTER). Furthermore, we show that hot sparing's PPW improvement on these applications is comparable to that of conventional techniques (e.g., voltage scaling) and can be combined with them to improve PPW more effectively. We also observe that microarchitectural hot redundant resources (e.g., hot shared-spare lanes) achieve better PPW improvement than conventional architectural redundancies (e.g., hot spare cores).

CCS Concepts: • Hardware → Yield and cost optimization; Application specific processors; Redundancy

• We investigate the implications of adding hot sparing to SIMT systems to improve PPW and cost.
• We introduce an estimation technique to calculate expected PPW for SIMT systems with hot sparing.
• We introduce an adapted design space exploration to find optimal SIMT syst
ISSN: 0167-9260, 1872-7522
DOI: 10.1016/j.vlsi.2019.03.010