Model distillation method and related equipment

The invention relates to the field of artificial intelligence and discloses a model distillation method, comprising: at a first computation node of a computation node cluster, distilling a student model by means of a partial model of the student model and a partial model of a teacher model, with the gradient back-propagation of the distillation performed entirely inside the first computation node. Each node thus completes the distillation of the network layers it is responsible for without depending on other computation nodes, achieving higher utilization of computing resources and thereby accelerating the distillation process.
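The scheme described above lends itself to a short sketch: each node holds only its assigned slice of the teacher and student, computes a distillation loss on that slice, and back-propagates locally. The following is a minimal, hypothetical PyTorch illustration; the class name NodeLocalDistiller, the MSE layer-matching loss, the optimizer choice, and all shapes are assumptions for illustration, not details taken from the patent.

```python
# Hypothetical sketch of node-local distillation, assuming an MSE loss
# between teacher and student layer outputs. Not the patented method
# itself, only an illustration of the abstract's idea.
import torch
import torch.nn as nn

class NodeLocalDistiller:
    """Distills one node's slice of the student against the matching
    slice of the teacher, with no cross-node gradient traffic."""

    def __init__(self, teacher_layers: nn.Module, student_layers: nn.Module, lr: float = 1e-4):
        self.teacher = teacher_layers.eval()      # frozen partial teacher model
        self.student = student_layers             # trainable partial student model
        self.loss_fn = nn.MSELoss()               # assumed layer-output matching loss
        self.opt = torch.optim.Adam(self.student.parameters(), lr=lr)

    def step(self, hidden_in: torch.Tensor) -> torch.Tensor:
        """One local distillation step on an incoming activation batch.
        Gradients stop at hidden_in, so back-propagation never leaves
        this node."""
        with torch.no_grad():
            target = self.teacher(hidden_in)      # teacher output for these layers
        pred = self.student(hidden_in.detach())   # detach keeps gradients node-local
        loss = self.loss_fn(pred, target)
        self.opt.zero_grad()
        loss.backward()                           # back-prop confined to this node
        self.opt.step()
        return pred.detach()                      # activations passed on to the next node

# Example: one node distilling a small slice of a toy model.
teacher_slice = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
student_slice = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
node0 = NodeLocalDistiller(teacher_slice, student_slice)
out = node0.step(torch.randn(8, 64))              # forward pass + local distillation step
```

Because `loss.backward()` touches only parameters resident on the node, no node waits on another's backward pass, which is the source of the compute-utilization gain the abstract claims.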

Detailed Description

Saved in:
Bibliographic Details
Main Authors: QIAN LI, SHANG LIFENG, HOU LU, JIANG XIN, BAI HAOLI
Format: Patent
Language: Chinese; English
Subjects:
Online Access: Order full text
creator QIAN LI
SHANG LIFENG
HOU LU
JIANG XIN
BAI HAOLI
description The invention relates to the field of artificial intelligence and discloses a model distillation method, comprising: at a first computation node of a computation node cluster, distilling a student model by means of a partial model of the student model and a partial model of a teacher model, with the gradient back-propagation of the distillation performed entirely inside the first computation node. Each node thus completes the distillation of the network layers it is responsible for without depending on other computation nodes, achieving higher utilization of computing resources and thereby accelerating the distillation process.
format Patent
fullrecord CN113850362A, published 2021-12-28; full record via esp@cenet: https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20211228&DB=EPODOC&CC=CN&NR=113850362A
language chi ; eng
recordid cdi_epo_espacenet_CN113850362A
source esp@cenet
subjects CALCULATING
COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
COMPUTING
COUNTING
PHYSICS
title Model distillation method and related equipment