Neural network model compression method and device, acceleration unit and computing system

The embodiment of the invention discloses a compression method for a neural network model, comprising the following steps: acquiring a weight matrix of the neural network model; dividing the weight matrix, by rows or by columns, into a plurality of weight groups, each comprising a plurality of weight values, where the data length of each weight group is determined by the bit width of an acceleration unit that executes the operations of the neural network model; training the neural network model with a weight-group-based sparsification algorithm; pruning the trained weight matrix at weight-group granularity to obtain a sparse weight matrix; and storing the sparse weight matrix in a predetermined storage format. The embodiment further discloses a corresponding neural network model compression device, a neural network model operation method, an acceleration unit, a calculation system, a …
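As a rough illustration of the weight-group-based pruning and group-aware storage the abstract describes, the following Python sketch splits each row of a weight matrix into fixed-length groups (the group length standing in for a length derived from the accelerator's bit width), zeroes the groups with the smallest L2 norms, and packs the surviving groups alongside a position bitmap. The function names, the norm-based pruning criterion, and the bitmap storage format are illustrative assumptions, not the patent's actual scheme.

```python
import numpy as np

def group_prune(weights, group_size, keep_ratio=0.5):
    """Prune a weight matrix at weight-group granularity (a sketch).

    Each row is split into contiguous groups of `group_size` values;
    in the patented scheme this length would be chosen to match the
    acceleration unit's bit width, here it is just a parameter.
    Groups with the smallest L2 norms are zeroed out.
    """
    rows, cols = weights.shape
    assert cols % group_size == 0, "columns must divide evenly into groups"
    groups = weights.reshape(rows, cols // group_size, group_size)
    norms = np.linalg.norm(groups, axis=2)            # one norm per group
    k = int(norms.size * (1 - keep_ratio))            # number of groups to drop
    threshold = np.partition(norms.ravel(), k)[k] if k > 0 else -np.inf
    mask = norms >= threshold                         # keep the strongest groups
    pruned = groups * mask[:, :, None]
    return pruned.reshape(rows, cols), mask

def to_sparse_format(pruned, mask, group_size):
    """Pack only the surviving groups plus a bitmap of their positions --
    one plausible 'predetermined storage format', not the patent's."""
    rows, n_groups = mask.shape
    groups = pruned.reshape(rows, n_groups, group_size)
    values = groups[mask]                             # dense block of kept groups
    return mask.astype(np.uint8), values

# Tiny demo: a 4x8 matrix, groups of 4, keep the strongest half of the groups.
W = np.arange(32, dtype=np.float32).reshape(4, 8)
pruned, mask = group_prune(W, group_size=4, keep_ratio=0.5)
bitmap, values = to_sparse_format(pruned, mask, group_size=4)
```

Because whole groups are zeroed rather than individual weights, the bitmap plus the contiguous `values` block is enough to reconstruct the sparse matrix, which is the kind of regularity a fixed-bit-width accelerator can exploit.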

Detailed Description

Bibliographic Details
Main authors: YAN CHENGYANG, LI YINGMIN, TU XIAOBIN, LAO MAOYUAN, MAO JUNWEI, ZHANG WEIFENG
Format: Patent
Language: Chinese; English
Record ID: cdi_epo_espacenet_CN113762493A
Source: esp@cenet
Subjects: CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; PHYSICS