Proactive Embedding on Cold Data for Deep Learning Recommendation Model Training
Deep learning recommendation models (DLRMs) are an important class of deep learning networks commonly used in many applications. DLRMs present unique challenges, especially for scale-out training, since they not only have compute- and memory-intensive components but the communication between the mu...
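Published in: | IEEE computer architecture letters 2024-07, Vol.23 (2), p.203-206 |
---|---|
Main authors: | Cho, Haeyoon; Son, Hyojun; Choi, Jungmin; Koh, Byungil; Ha, Minho; Kim, John |
Format: | Article |
Language: | eng |
Subjects: | Backpropagation; Data models; Deep learning; Graphics processing units; Parallel processing; Pipelines; recommendation system; Training |
Online access: | Order full text |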
container_end_page | 206 |
---|---|
container_issue | 2 |
container_start_page | 203 |
container_title | IEEE computer architecture letters |
container_volume | 23 |
creator | Cho, Haeyoon; Son, Hyojun; Choi, Jungmin; Koh, Byungil; Ha, Minho; Kim, John |
description | Deep learning recommendation models (DLRMs) are an important class of deep learning networks commonly used in many applications. DLRMs present unique challenges, especially for scale-out training, since they not only have compute- and memory-intensive components but also place the communication between multiple GPUs on the critical path. In this work, we show how cold data in DLRM embedding tables can be exploited to enable proactive embedding. In particular, proactive embedding allows embedding table accesses to be done in advance, reducing the impact of memory access latency by overlapping the embedding access with communication. Our analysis of proactive embedding demonstrates that it can improve overall training performance by 46%. |
doi_str_mv | 10.1109/LCA.2024.3445948 |
format | Article |
fullrecord | <record><control><sourceid>crossref_RIE</sourceid><recordid>TN_cdi_ieee_primary_10654665</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10654665</ieee_id><sourcerecordid>10_1109_LCA_2024_3445948</sourcerecordid><originalsourceid>FETCH-LOGICAL-c625-2dfb515f665c844456b0d5b3917b6d2f04798011030f46a74e32b24c7ee0f2573</originalsourceid><addsrcrecordid>eNpNkE9LxDAQxYMouK7ePXjIF-g6SSZpe1y66x-ouMjeS9pMpLJtlrQIfntbdhFPM_DeG978GLsXsBIC8seyWK8kSFwpRJ1jdsEWQmuTGDB4-bdrc81uhuELAI3KcMF2uxhsM7bfxLddTc61_ScPPS_CwfGNHS33IfIN0ZGXZGM_yx_UhK6j3tmxnaxvwdGB76NtZ_WWXXl7GOjuPJds_7TdFy9J-f78WqzLpDFSJ9L5WgvtjdFNhlNjU4PTtcpFWhsnPWCaZzA9psCjsSmSkrXEJiUCL3WqlgxOZ5sYhiGSr46x7Wz8qQRUM5BqAlLNQKozkCnycIq0RPTPbjRONdQvGeRbFA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Proactive Embedding on Cold Data for Deep Learning Recommendation Model Training</title><source>IEEE Electronic Library (IEL)</source><creator>Cho, Haeyoon ; Son, Hyojun ; Choi, Jungmin ; Koh, Byungil ; Ha, Minho ; Kim, John</creator><creatorcontrib>Cho, Haeyoon ; Son, Hyojun ; Choi, Jungmin ; Koh, Byungil ; Ha, Minho ; Kim, John</creatorcontrib><description>Deep learning recommendation model (DLRM) is an important class of deep learning networks that are commonly used in many applications. DLRM presents unique challenges, especially for scale-out training since it not only has compute and memory-intensive components but the communication between the multiple GPUs is also on the critical path. In this work, we propose how cold data in DLRM embedding tables can be exploited to propose proactive embedding. In particular, proactive embedding allows embedding table accesses to be done in advance to reduce the impact of the memory access latency by overlapping the embedding access with communication. 
Our analysis of proactive embedding demonstrates that it can improve overall training performance by 46%.</description><identifier>ISSN: 1556-6056</identifier><identifier>EISSN: 1556-6064</identifier><identifier>DOI: 10.1109/LCA.2024.3445948</identifier><identifier>CODEN: ICALC3</identifier><language>eng</language><publisher>IEEE</publisher><subject>Backpropagation ; Data models ; Deep learning ; Graphics processing units ; Parallel processing ; Pipelines ; recommendation system ; Training</subject><ispartof>IEEE computer architecture letters, 2024-07, Vol.23 (2), p.203-206</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c625-2dfb515f665c844456b0d5b3917b6d2f04798011030f46a74e32b24c7ee0f2573</cites><orcidid>0000-0003-3958-3891 ; 0000-0002-5199-9139</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10654665$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10654665$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Cho, Haeyoon</creatorcontrib><creatorcontrib>Son, Hyojun</creatorcontrib><creatorcontrib>Choi, Jungmin</creatorcontrib><creatorcontrib>Koh, Byungil</creatorcontrib><creatorcontrib>Ha, Minho</creatorcontrib><creatorcontrib>Kim, John</creatorcontrib><title>Proactive Embedding on Cold Data for Deep Learning Recommendation Model Training</title><title>IEEE computer architecture letters</title><addtitle>LCA</addtitle><description>Deep learning recommendation model (DLRM) is an important class of deep learning networks that are commonly used in many applications. 
DLRM presents unique challenges, especially for scale-out training since it not only has compute and memory-intensive components but the communication between the multiple GPUs is also on the critical path. In this work, we propose how cold data in DLRM embedding tables can be exploited to propose proactive embedding. In particular, proactive embedding allows embedding table accesses to be done in advance to reduce the impact of the memory access latency by overlapping the embedding access with communication. Our analysis of proactive embedding demonstrates that it can improve overall training performance by 46%.</description><subject>Backpropagation</subject><subject>Data models</subject><subject>Deep learning</subject><subject>Graphics processing units</subject><subject>Parallel processing</subject><subject>Pipelines</subject><subject>recommendation system</subject><subject>Training</subject><issn>1556-6056</issn><issn>1556-6064</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkE9LxDAQxYMouK7ePXjIF-g6SSZpe1y66x-ouMjeS9pMpLJtlrQIfntbdhFPM_DeG978GLsXsBIC8seyWK8kSFwpRJ1jdsEWQmuTGDB4-bdrc81uhuELAI3KcMF2uxhsM7bfxLddTc61_ScPPS_CwfGNHS33IfIN0ZGXZGM_yx_UhK6j3tmxnaxvwdGB76NtZ_WWXXl7GOjuPJds_7TdFy9J-f78WqzLpDFSJ9L5WgvtjdFNhlNjU4PTtcpFWhsnPWCaZzA9psCjsSmSkrXEJiUCL3WqlgxOZ5sYhiGSr46x7Wz8qQRUM5BqAlLNQKozkCnycIq0RPTPbjRONdQvGeRbFA</recordid><startdate>202407</startdate><enddate>202407</enddate><creator>Cho, Haeyoon</creator><creator>Son, Hyojun</creator><creator>Choi, Jungmin</creator><creator>Koh, Byungil</creator><creator>Ha, Minho</creator><creator>Kim, John</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0003-3958-3891</orcidid><orcidid>https://orcid.org/0000-0002-5199-9139</orcidid></search><sort><creationdate>202407</creationdate><title>Proactive 
Embedding on Cold Data for Deep Learning Recommendation Model Training</title><author>Cho, Haeyoon ; Son, Hyojun ; Choi, Jungmin ; Koh, Byungil ; Ha, Minho ; Kim, John</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c625-2dfb515f665c844456b0d5b3917b6d2f04798011030f46a74e32b24c7ee0f2573</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Backpropagation</topic><topic>Data models</topic><topic>Deep learning</topic><topic>Graphics processing units</topic><topic>Parallel processing</topic><topic>Pipelines</topic><topic>recommendation system</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Cho, Haeyoon</creatorcontrib><creatorcontrib>Son, Hyojun</creatorcontrib><creatorcontrib>Choi, Jungmin</creatorcontrib><creatorcontrib>Koh, Byungil</creatorcontrib><creatorcontrib>Ha, Minho</creatorcontrib><creatorcontrib>Kim, John</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><jtitle>IEEE computer architecture letters</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Cho, Haeyoon</au><au>Son, Hyojun</au><au>Choi, Jungmin</au><au>Koh, Byungil</au><au>Ha, Minho</au><au>Kim, John</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Proactive Embedding on Cold Data for Deep Learning Recommendation Model Training</atitle><jtitle>IEEE computer architecture 
letters</jtitle><stitle>LCA</stitle><date>2024-07</date><risdate>2024</risdate><volume>23</volume><issue>2</issue><spage>203</spage><epage>206</epage><pages>203-206</pages><issn>1556-6056</issn><eissn>1556-6064</eissn><coden>ICALC3</coden><abstract>Deep learning recommendation model (DLRM) is an important class of deep learning networks that are commonly used in many applications. DLRM presents unique challenges, especially for scale-out training since it not only has compute and memory-intensive components but the communication between the multiple GPUs is also on the critical path. In this work, we propose how cold data in DLRM embedding tables can be exploited to propose proactive embedding. In particular, proactive embedding allows embedding table accesses to be done in advance to reduce the impact of the memory access latency by overlapping the embedding access with communication. Our analysis of proactive embedding demonstrates that it can improve overall training performance by 46%.</abstract><pub>IEEE</pub><doi>10.1109/LCA.2024.3445948</doi><tpages>4</tpages><orcidid>https://orcid.org/0000-0003-3958-3891</orcidid><orcidid>https://orcid.org/0000-0002-5199-9139</orcidid></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1556-6056 |
ispartof | IEEE computer architecture letters, 2024-07, Vol.23 (2), p.203-206 |
issn | 1556-6056; 1556-6064 |
language | eng |
recordid | cdi_ieee_primary_10654665 |
source | IEEE Electronic Library (IEL) |
subjects | Backpropagation; Data models; Deep learning; Graphics processing units; Parallel processing; Pipelines; recommendation system; Training |
title | Proactive Embedding on Cold Data for Deep Learning Recommendation Model Training |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T12%3A02%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Proactive%20Embedding%20on%20Cold%20Data%20for%20Deep%20Learning%20Recommendation%20Model%20Training&rft.jtitle=IEEE%20computer%20architecture%20letters&rft.au=Cho,%20Haeyoon&rft.date=2024-07&rft.volume=23&rft.issue=2&rft.spage=203&rft.epage=206&rft.pages=203-206&rft.issn=1556-6056&rft.eissn=1556-6064&rft.coden=ICALC3&rft_id=info:doi/10.1109/LCA.2024.3445948&rft_dat=%3Ccrossref_RIE%3E10_1109_LCA_2024_3445948%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10654665&rfr_iscdi=true |
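
The abstract describes performing embedding table accesses in advance so that they overlap with communication. The record does not give the authors' algorithm, so the following is only an illustrative sketch under stated assumptions: a row counts as "cold" here if the current batch does not touch it (and so has no pending update), and the latency figures are made-up placeholders. The names `split_next_batch` and `step_time` are hypothetical and do not come from the paper.

```python
def split_next_batch(current_ids, next_ids):
    """Partition the next batch's embedding row IDs (illustrative assumption).

    Rows the current batch does not touch have no pending update, so in this
    sketch they are treated as cold and eligible for an early lookup.
    """
    pending = set(current_ids)           # rows the current step will update
    early, deferred = [], []
    for row in dict.fromkeys(next_ids):  # de-duplicate, keep first-seen order
        (deferred if row in pending else early).append(row)
    return early, deferred


def step_time(t_embed, t_comm, t_compute, early_fraction):
    """Toy latency model: the early share of lookups hides behind communication."""
    hidden = min(t_embed * early_fraction, t_comm)  # cannot hide more than t_comm
    return t_embed + t_comm + t_compute - hidden


# Hypothetical row IDs and latencies, chosen only to exercise the functions.
early, deferred = split_next_batch([3, 7, 7, 42], [7, 8, 9, 42, 8])
print(early, deferred)
print(step_time(4.0, 5.0, 6.0, 0.0), step_time(4.0, 5.0, 6.0, 0.75))
```

In this toy model the benefit grows with the share of lookups that can be issued early and is capped by the communication time, which is one plausible way to read the abstract's emphasis on cold data; the 46% figure reported in the abstract is not derived from it.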