A carbon-nanotube-based tensor processing unit

Bibliographic Details
Published in: Nature Electronics 2024-08, Vol. 7 (8), p. 684-693
Main Authors: Si, Jia; Zhang, Panpan; Zhao, Chenyi; Lin, Dongyi; Xu, Lin; Xu, Haitao; Liu, Lijun; Jiang, Jianhua; Peng, Lian-Mao; Zhang, Zhiyong
Format: Article
Language: English
Online Access: Full text
Description
Abstract: The growth of data-intensive computing tasks requires processing units with higher performance and energy efficiency, but these requirements are increasingly difficult to achieve with conventional semiconductor technology. One potential solution is to combine developments in devices with innovations in system architecture. Here we report a tensor processing unit (TPU) that is based on 3,000 carbon nanotube field-effect transistors and can perform energy-efficient convolution operations and matrix multiplication. The TPU is constructed with a systolic array architecture that allows parallel 2 bit integer multiply–accumulate operations. A five-layer convolutional neural network based on the TPU can perform MNIST image recognition with an accuracy of up to 88% for a power consumption of 295 µW. We use an optimized nanotube fabrication process that offers a semiconductor purity of 99.9999% and ultraclean surfaces, leading to transistors with high on-current densities and uniformity. Using system-level simulations, we estimate that an 8 bit TPU made with nanotube transistors at a 180 nm technology node could reach a main frequency of 850 MHz and an energy efficiency of 1 tera-operations per second per watt. Carbon nanotube networks made with high purity and ultraclean interfaces can be used to make a tensor processing unit that contains 3,000 transistors in a systolic array architecture to improve energy efficiency in accelerating neural network tasks.
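
Note: The abstract describes a systolic array performing parallel 2 bit integer multiply–accumulate operations. The following Python sketch is an illustration only, not the authors' implementation: it simulates the dataflow of a generic output-stationary systolic array of that kind. The array dimensions, the operand-skewing scheme, and the function names are assumptions made for the example.

import numpy as np

def quantize_2bit(x):
    # Clamp values to the signed 2 bit integer range [-2, 1] (assumed encoding).
    return np.clip(np.round(x), -2, 1).astype(np.int32)

def systolic_matmul(A, B):
    # Multiply 2 bit matrices A (m x k) and B (k x n) by streaming skewed
    # operands through a grid of multiply-accumulate (MAC) cells, one step
    # per simulated cycle.
    A, B = quantize_2bit(A), quantize_2bit(B)
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    acc = np.zeros((m, n), dtype=np.int32)  # one accumulator per processing element
    # At cycle t, PE (i, j) receives A[i, s] from the left and B[s, j] from
    # above, where s = t - i - j; skewing keeps matching operands aligned.
    for t in range(m + n + k - 2):
        for i in range(m):
            for j in range(n):
                s = t - i - j
                if 0 <= s < k:
                    acc[i, j] += A[i, s] * B[s, j]
    return acc

rng = np.random.default_rng(0)
A = rng.integers(-2, 2, size=(4, 4))
B = rng.integers(-2, 2, size=(4, 4))
assert np.array_equal(systolic_matmul(A, B), A @ B)  # matches a dense matmul

Each processing element holds one accumulator and consumes one operand pair per cycle; after m + n + k - 2 cycles the accumulators hold the full product matrix, which is why a systolic array can sustain high MAC throughput with purely local data movement.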
ISSN: 2520-1131
DOI: 10.1038/s41928-024-01211-2