Neural network model compression method and device, acceleration unit and computing system

The embodiment of the invention discloses a compression method for a neural network model, comprising the following steps: acquiring a weight matrix of the neural network model; dividing the weight matrix, by rows or by columns, into a plurality of weight groups, each comprising a plurality of weight values, where the data length of each weight group is determined by the bit width of an acceleration unit that executes the operations of the neural network model; training the neural network model with a weight-group-based sparsification algorithm; pruning the trained weight matrix at weight-group granularity to obtain a sparse weight matrix; and storing the sparse weight matrix in a predetermined storage format. The embodiment further discloses a corresponding neural network model compression device, a neural network model operation method, an acceleration unit, a calculation system, a …
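As a rough illustration of the weight-group-based pruning and group-aware storage the abstract describes, the following Python sketch splits each row of a weight matrix into fixed-length groups (the group length standing in for a length derived from the accelerator's bit width), zeroes the groups with the smallest L2 norms, and packs the surviving groups alongside a position bitmap. The function names, the norm-based pruning criterion, and the bitmap storage format are illustrative assumptions, not the patent's actual scheme.

```python
import numpy as np

def group_prune(weights, group_size, keep_ratio=0.5):
    """Prune a weight matrix at weight-group granularity (a sketch).

    Each row is split into contiguous groups of `group_size` values;
    in the patented scheme this length would be chosen to match the
    acceleration unit's bit width, here it is just a parameter.
    Groups with the smallest L2 norms are zeroed out.
    """
    rows, cols = weights.shape
    assert cols % group_size == 0, "columns must divide evenly into groups"
    groups = weights.reshape(rows, cols // group_size, group_size)
    norms = np.linalg.norm(groups, axis=2)            # one norm per group
    k = int(norms.size * (1 - keep_ratio))            # number of groups to drop
    threshold = np.partition(norms.ravel(), k)[k] if k > 0 else -np.inf
    mask = norms >= threshold                         # keep the strongest groups
    pruned = groups * mask[:, :, None]
    return pruned.reshape(rows, cols), mask

def to_sparse_format(pruned, mask, group_size):
    """Pack only the surviving groups plus a bitmap of their positions --
    one plausible 'predetermined storage format', not the patent's."""
    rows, n_groups = mask.shape
    groups = pruned.reshape(rows, n_groups, group_size)
    values = groups[mask]                             # dense block of kept groups
    return mask.astype(np.uint8), values

# Tiny demo: a 4x8 matrix, groups of 4, keep the strongest half of the groups.
W = np.arange(32, dtype=np.float32).reshape(4, 8)
pruned, mask = group_prune(W, group_size=4, keep_ratio=0.5)
bitmap, values = to_sparse_format(pruned, mask, group_size=4)
```

Because whole groups are zeroed rather than individual weights, the bitmap plus the contiguous `values` block is enough to reconstruct the sparse matrix, which is the kind of regularity a fixed-bit-width accelerator can exploit.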

Detailed Description

Bibliographic Details
Main authors: YAN CHENGYANG, LI YINGMIN, TU XIAOBIN, LAO MAOYUAN, MAO JUNWEI, ZHANG WEIFENG
Format: Patent
Language: Chinese; English
Record ID: cdi_epo_espacenet_CN113762493A
Source: esp@cenet
Subjects: CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; PHYSICS