Model distillation method and related equipment

The invention relates to the field of artificial intelligence and discloses a model distillation method, comprising: at a first computation node of a computation node cluster, distilling a student model by means of a partial model of the student model and a partial model of a teacher model, with the gradient back-propagation of the distillation performed entirely inside the first computation node. Each node thus completes the distillation of the network layers it is responsible for without depending on other computation nodes, achieving higher utilization of computing resources and thereby accelerating the distillation process.
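The scheme described above lends itself to a short sketch: each node holds only its assigned slice of the teacher and student, computes a distillation loss on that slice, and back-propagates locally. The following is a minimal, hypothetical PyTorch illustration; the class name NodeLocalDistiller, the MSE layer-matching loss, the optimizer choice, and all shapes are assumptions for illustration, not details taken from the patent.

```python
# Hypothetical sketch of node-local distillation, assuming an MSE loss
# between teacher and student layer outputs. Not the patented method
# itself, only an illustration of the abstract's idea.
import torch
import torch.nn as nn

class NodeLocalDistiller:
    """Distills one node's slice of the student against the matching
    slice of the teacher, with no cross-node gradient traffic."""

    def __init__(self, teacher_layers: nn.Module, student_layers: nn.Module, lr: float = 1e-4):
        self.teacher = teacher_layers.eval()      # frozen partial teacher model
        self.student = student_layers             # trainable partial student model
        self.loss_fn = nn.MSELoss()               # assumed layer-output matching loss
        self.opt = torch.optim.Adam(self.student.parameters(), lr=lr)

    def step(self, hidden_in: torch.Tensor) -> torch.Tensor:
        """One local distillation step on an incoming activation batch.
        Gradients stop at hidden_in, so back-propagation never leaves
        this node."""
        with torch.no_grad():
            target = self.teacher(hidden_in)      # teacher output for these layers
        pred = self.student(hidden_in.detach())   # detach keeps gradients node-local
        loss = self.loss_fn(pred, target)
        self.opt.zero_grad()
        loss.backward()                           # back-prop confined to this node
        self.opt.step()
        return pred.detach()                      # activations passed on to the next node

# Example: one node distilling a small slice of a toy model.
teacher_slice = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
student_slice = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
node0 = NodeLocalDistiller(teacher_slice, student_slice)
out = node0.step(torch.randn(8, 64))              # forward pass + local distillation step
```

Because `loss.backward()` touches only parameters resident on the node, no node waits on another's backward pass, which is the source of the compute-utilization gain the abstract claims.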

Detailed Description

Saved in:
Bibliographic Details
Main Authors: QIAN LI, SHANG LIFENG, HOU LU, JIANG XIN, BAI HAOLI
Format: Patent
Language: Chinese; English
Subjects:
Online Access: Order full text
creator QIAN LI
SHANG LIFENG
HOU LU
JIANG XIN
BAI HAOLI
description The invention relates to the field of artificial intelligence and discloses a model distillation method, comprising: at a first computation node of a computation node cluster, distilling a student model by means of a partial model of the student model and a partial model of a teacher model, with the gradient back-propagation of the distillation performed entirely inside the first computation node. Each node thus completes the distillation of the network layers it is responsible for without depending on other computation nodes, achieving higher utilization of computing resources and thereby accelerating the distillation process.
format Patent
fullrecord CN113850362A, published 2021-12-28; full record via esp@cenet: https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20211228&DB=EPODOC&CC=CN&NR=113850362A
language chi ; eng
recordid cdi_epo_espacenet_CN113850362A
source esp@cenet
subjects CALCULATING
COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
COMPUTING
COUNTING
PHYSICS
title Model distillation method and related equipment