IntAct: A 96-Core Processor With Six Chiplets 3D-Stacked on an Active Interposer With Distributed Interconnects and Integrated Power Management
Published in: | IEEE Journal of Solid-State Circuits, 2021-01, Vol. 56 (1), p. 79-97 |
---|---|
Format: | Article |
Language: | English |
Abstract: | In the context of high-performance computing, the integration of more computing capabilities with generic cores or dedicated accelerators for artificial intelligence (AI) applications is raising more and more challenges. Due to the increasing costs of advanced nodes and the difficulties of shrinking analog and circuit input/output signals (IOs), alternative architecture solutions to the single die are becoming mainstream. Chiplet-based systems using 3D technologies enable modular and scalable architecture and technology partitioning. Nevertheless, there are still limitations due to chiplet integration on passive interposers, whether silicon or organic. In this article, we present the first CMOS active interposer, integrating: 1) power management without any external components; 2) distributed interconnects enabling any chiplet-to-chiplet communication; and 3) system infrastructure, design-for-test, and circuit IOs. The IntAct circuit prototype integrates six chiplets in FDSOI 28-nm technology, which are 3D-stacked onto this active interposer in a 65-nm process, offering a total of 96 computing cores. Full scalability of the computing system is achieved using an innovative scalable cache-coherent memory hierarchy, enabled by distributed networks-on-chip, with 3-Tbit/s/mm² high-bandwidth 3D-plug interfaces using 20-µm pitch micro-bumps and 0.6-ns/mm low-latency asynchronous interconnects, while the six chiplets are locally power-supplied at 156 mW/mm² by 82%-peak-efficiency dc-dc converters through the active interposer. Thermal dissipation is studied, showing the feasibility of such an approach. |
---|---|
ISSN: | 0018-9200, 1558-173X |
DOI: | 10.1109/JSSC.2020.3036341 |