Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data-centers. In this paper we discuss the SW/HW co-designed solution for high-performance distributed training o...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2023-02
Hauptverfasser: Mudigere, Dheevatsa, Hao, Yuchen, Huang, Jianyu, Jia, Zhihao, Tulloch, Andrew, Sridharan, Srinivas, Liu, Xing, Ozdal, Mustafa, Nie, Jade, Park, Jongsoo, Luo, Liang, Yang, Jie Amy, Gao, Leon, Ivchenko, Dmytro, Basant, Aarti, Hu, Yuxi, Yang, Jiyan, Ardestani, Ehsan K, Wang, Xiaodong, Komuravelli, Rakesh, Chu, Ching-Hsiang, Yilmaz, Serhat, Li, Huayu, Qian, Jiyuan, Feng, Zhuobo, Ma, Yinbin, Yang, Junjie, Wen, Ellie, Li, Hong, Yang, Lin, Sun, Chonglin, Zhao, Whitney, Melts, Dimitry, Dhulipala, Krishna, Kishore, K R, Graf, Tyler, Eisenman, Assaf, Matam, Kiran Kumar, Gangidi, Adi, Chen, Guoqiang Jerry, Krishnan, Manoj, Nayak, Avinash, Nair, Krishnakumar, Muthiah, Bharath, khorashadi, Mahmoud, Bhattacharya, Pallab, Lapukhov, Petr, Naumov, Maxim, Mathews, Ajit, Lin, Qiao, Smelyanskiy, Mikhail, Jia, Bill, Rao, Vijay
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!