Proactive Embedding on Cold Data for Deep Learning Recommendation Model Training

Deep learning recommendation model (DLRM) is an important class of deep learning networks that are commonly used in many applications. DLRM presents unique challenges, especially for scale-out training since it not only has compute- and memory-intensive components but the communication between the mu...

Bibliographic details
Published in: IEEE computer architecture letters 2024-07, Vol.23 (2), p.203-206
Main authors: Cho, Haeyoon, Son, Hyojun, Choi, Jungmin, Koh, Byungil, Ha, Minho, Kim, John
Format: Article
Language: eng
container_end_page 206
container_issue 2
container_start_page 203
container_title IEEE computer architecture letters
container_volume 23
creator Cho, Haeyoon
Son, Hyojun
Choi, Jungmin
Koh, Byungil
Ha, Minho
Kim, John
description Deep learning recommendation models (DLRMs) are an important class of deep learning networks used in many applications. DLRM presents unique challenges, especially for scale-out training: it has compute- and memory-intensive components, and the communication between the multiple GPUs is also on the critical path. In this work, we show how cold data in DLRM embedding tables can be exploited to enable proactive embedding. In particular, proactive embedding allows embedding table accesses to be done in advance, reducing the impact of memory access latency by overlapping the embedding access with communication. Our analysis of proactive embedding demonstrates that it can improve overall training performance by 46%.
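The overlap described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the record gives no mechanism details, so the example assumes that "cold" rows are those the current batch does not touch (and so can be read early without seeing a stale value), and it uses a thread pool and `time.sleep` as stand-ins for GPU memory access and inter-GPU communication. All names (`TABLE`, `lookup`, `communicate`, `split_cold`, `overlapped_step`) are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy embedding table: row id -> embedding vector.
TABLE = {i: [float(i), i * 0.5] for i in range(8)}


def lookup(ids):
    """Stand-in for a memory-bound embedding table access."""
    time.sleep(0.01)  # simulated memory access latency
    return [TABLE[i] for i in ids]


def communicate(payload):
    """Stand-in for inter-GPU communication (assumed blocking)."""
    time.sleep(0.01)  # simulated communication latency
    return payload


def split_cold(next_ids, current_ids):
    """Split the next batch's ids into cold ids (not touched by the
    current batch, an assumption of this sketch) and hot ids."""
    current = set(current_ids)
    cold = [i for i in next_ids if i not in current]
    hot = [i for i in next_ids if i in current]
    return cold, hot


def overlapped_step(current_ids, next_ids, pool):
    """Fetch the next batch's cold rows while communication is in flight."""
    cold, hot = split_cold(next_ids, current_ids)
    pending = pool.submit(lookup, cold)  # proactive access starts early
    communicate(current_ids)             # runs concurrently with the lookup
    # Updates to the rows in current_ids would be applied here; hot rows
    # are therefore read only after this point.
    rows = dict(zip(cold, pending.result()))
    rows.update(zip(hot, lookup(hot)))
    return [rows[i] for i in next_ids]


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        print(overlapped_step([0, 1], [1, 2, 3], pool))
```

In this toy setup the cold lookup and the communication each take about the same simulated time, so running them concurrently hides most of the lookup latency; how much is hidden in a real system depends on the fraction of cold rows and the relative cost of the two phases.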
doi_str_mv 10.1109/LCA.2024.3445948
format Article
fullrecord <record><control><sourceid>crossref_RIE</sourceid><recordid>TN_cdi_ieee_primary_10654665</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10654665</ieee_id><sourcerecordid>10_1109_LCA_2024_3445948</sourcerecordid><originalsourceid>FETCH-LOGICAL-c625-2dfb515f665c844456b0d5b3917b6d2f04798011030f46a74e32b24c7ee0f2573</originalsourceid><addsrcrecordid>eNpNkE9LxDAQxYMouK7ePXjIF-g6SSZpe1y66x-ouMjeS9pMpLJtlrQIfntbdhFPM_DeG978GLsXsBIC8seyWK8kSFwpRJ1jdsEWQmuTGDB4-bdrc81uhuELAI3KcMF2uxhsM7bfxLddTc61_ScPPS_CwfGNHS33IfIN0ZGXZGM_yx_UhK6j3tmxnaxvwdGB76NtZ_WWXXl7GOjuPJds_7TdFy9J-f78WqzLpDFSJ9L5WgvtjdFNhlNjU4PTtcpFWhsnPWCaZzA9psCjsSmSkrXEJiUCL3WqlgxOZ5sYhiGSr46x7Wz8qQRUM5BqAlLNQKozkCnycIq0RPTPbjRONdQvGeRbFA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Proactive Embedding on Cold Data for Deep Learning Recommendation Model Training</title><source>IEEE Electronic Library (IEL)</source><creator>Cho, Haeyoon ; Son, Hyojun ; Choi, Jungmin ; Koh, Byungil ; Ha, Minho ; Kim, John</creator><creatorcontrib>Cho, Haeyoon ; Son, Hyojun ; Choi, Jungmin ; Koh, Byungil ; Ha, Minho ; Kim, John</creatorcontrib><description>Deep learning recommendation model (DLRM) is an important class of deep learning networks that are commonly used in many applications. DRLM presents unique challenges, especially for scale-out training since it not only has compute and memory-intensive components but the communication between the multiple GPUs is also on the critical path. In this work, we propose how cold data in DLRM embedding tables can be exploited to propose proactive embedding. In particular, proactive embedding allows embedding table accesses to be done in advance to reduce the impact of the memory access latency by overlapping the embedding access with communication. 
Our analysis of proactive embedding demonstrates that it can improve overall training performance by 46%.</description><identifier>ISSN: 1556-6056</identifier><identifier>EISSN: 1556-6064</identifier><identifier>DOI: 10.1109/LCA.2024.3445948</identifier><identifier>CODEN: ICALC3</identifier><language>eng</language><publisher>IEEE</publisher><subject>Backpropagation ; Data models ; Deep learning ; Graphics processing units ; Parallel processing ; Pipelines ; recommendation system ; Training</subject><ispartof>IEEE computer architecture letters, 2024-07, Vol.23 (2), p.203-206</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c625-2dfb515f665c844456b0d5b3917b6d2f04798011030f46a74e32b24c7ee0f2573</cites><orcidid>0000-0003-3958-3891 ; 0000-0002-5199-9139</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10654665$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10654665$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Cho, Haeyoon</creatorcontrib><creatorcontrib>Son, Hyojun</creatorcontrib><creatorcontrib>Choi, Jungmin</creatorcontrib><creatorcontrib>Koh, Byungil</creatorcontrib><creatorcontrib>Ha, Minho</creatorcontrib><creatorcontrib>Kim, John</creatorcontrib><title>Proactive Embedding on Cold Data for Deep Learning Recommendation Model Training</title><title>IEEE computer architecture letters</title><addtitle>LCA</addtitle><description>Deep learning recommendation model (DLRM) is an important class of deep learning networks that are commonly used in many applications. 
DRLM presents unique challenges, especially for scale-out training since it not only has compute and memory-intensive components but the communication between the multiple GPUs is also on the critical path. In this work, we propose how cold data in DLRM embedding tables can be exploited to propose proactive embedding. In particular, proactive embedding allows embedding table accesses to be done in advance to reduce the impact of the memory access latency by overlapping the embedding access with communication. Our analysis of proactive embedding demonstrates that it can improve overall training performance by 46%.</description><subject>Backpropagation</subject><subject>Data models</subject><subject>Deep learning</subject><subject>Graphics processing units</subject><subject>Parallel processing</subject><subject>Pipelines</subject><subject>recommendation system</subject><subject>Training</subject><issn>1556-6056</issn><issn>1556-6064</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkE9LxDAQxYMouK7ePXjIF-g6SSZpe1y66x-ouMjeS9pMpLJtlrQIfntbdhFPM_DeG978GLsXsBIC8seyWK8kSFwpRJ1jdsEWQmuTGDB4-bdrc81uhuELAI3KcMF2uxhsM7bfxLddTc61_ScPPS_CwfGNHS33IfIN0ZGXZGM_yx_UhK6j3tmxnaxvwdGB76NtZ_WWXXl7GOjuPJds_7TdFy9J-f78WqzLpDFSJ9L5WgvtjdFNhlNjU4PTtcpFWhsnPWCaZzA9psCjsSmSkrXEJiUCL3WqlgxOZ5sYhiGSr46x7Wz8qQRUM5BqAlLNQKozkCnycIq0RPTPbjRONdQvGeRbFA</recordid><startdate>202407</startdate><enddate>202407</enddate><creator>Cho, Haeyoon</creator><creator>Son, Hyojun</creator><creator>Choi, Jungmin</creator><creator>Koh, Byungil</creator><creator>Ha, Minho</creator><creator>Kim, John</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0003-3958-3891</orcidid><orcidid>https://orcid.org/0000-0002-5199-9139</orcidid></search><sort><creationdate>202407</creationdate><title>Proactive 
Embedding on Cold Data for Deep Learning Recommendation Model Training</title><author>Cho, Haeyoon ; Son, Hyojun ; Choi, Jungmin ; Koh, Byungil ; Ha, Minho ; Kim, John</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c625-2dfb515f665c844456b0d5b3917b6d2f04798011030f46a74e32b24c7ee0f2573</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Backpropagation</topic><topic>Data models</topic><topic>Deep learning</topic><topic>Graphics processing units</topic><topic>Parallel processing</topic><topic>Pipelines</topic><topic>recommendation system</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Cho, Haeyoon</creatorcontrib><creatorcontrib>Son, Hyojun</creatorcontrib><creatorcontrib>Choi, Jungmin</creatorcontrib><creatorcontrib>Koh, Byungil</creatorcontrib><creatorcontrib>Ha, Minho</creatorcontrib><creatorcontrib>Kim, John</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><jtitle>IEEE computer architecture letters</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Cho, Haeyoon</au><au>Son, Hyojun</au><au>Choi, Jungmin</au><au>Koh, Byungil</au><au>Ha, Minho</au><au>Kim, John</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Proactive Embedding on Cold Data for Deep Learning Recommendation Model Training</atitle><jtitle>IEEE computer architecture 
letters</jtitle><stitle>LCA</stitle><date>2024-07</date><risdate>2024</risdate><volume>23</volume><issue>2</issue><spage>203</spage><epage>206</epage><pages>203-206</pages><issn>1556-6056</issn><eissn>1556-6064</eissn><coden>ICALC3</coden><abstract>Deep learning recommendation model (DLRM) is an important class of deep learning networks that are commonly used in many applications. DLRM presents unique challenges, especially for scale-out training since it not only has compute and memory-intensive components but the communication between the multiple GPUs is also on the critical path. In this work, we propose how cold data in DLRM embedding tables can be exploited to propose proactive embedding. In particular, proactive embedding allows embedding table accesses to be done in advance to reduce the impact of the memory access latency by overlapping the embedding access with communication. Our analysis of proactive embedding demonstrates that it can improve overall training performance by 46%.</abstract><pub>IEEE</pub><doi>10.1109/LCA.2024.3445948</doi><tpages>4</tpages><orcidid>https://orcid.org/0000-0003-3958-3891</orcidid><orcidid>https://orcid.org/0000-0002-5199-9139</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1556-6056
ispartof IEEE computer architecture letters, 2024-07, Vol.23 (2), p.203-206
issn 1556-6056
1556-6064
language eng
recordid cdi_ieee_primary_10654665
source IEEE Electronic Library (IEL)
subjects Backpropagation
Data models
Deep learning
Graphics processing units
Parallel processing
Pipelines
recommendation system
Training
title Proactive Embedding on Cold Data for Deep Learning Recommendation Model Training
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T12%3A02%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Proactive%20Embedding%20on%20Cold%20Data%20for%20Deep%20Learning%20Recommendation%20Model%20Training&rft.jtitle=IEEE%20computer%20architecture%20letters&rft.au=Cho,%20Haeyoon&rft.date=2024-07&rft.volume=23&rft.issue=2&rft.spage=203&rft.epage=206&rft.pages=203-206&rft.issn=1556-6056&rft.eissn=1556-6064&rft.coden=ICALC3&rft_id=info:doi/10.1109/LCA.2024.3445948&rft_dat=%3Ccrossref_RIE%3E10_1109_LCA_2024_3445948%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10654665&rfr_iscdi=true