Neural network model compression method and device, acceleration unit and computing system
The embodiment of the invention discloses a compression method for a neural network model. The method comprises the following steps: acquiring a weight matrix of the neural network model; dividing the weight matrix, by rows or by columns, into a plurality of weight groups, each comprising a plurality of weight values, where the data length of each weight group is determined by the bit width of an acceleration unit that executes operations related to the neural network model; training the neural network model with a weight-group-based sparsification algorithm; pruning the trained weight matrix by weight group to obtain a sparse weight matrix; and storing the sparse weight matrix in a predetermined storage format. The embodiment of the invention further discloses a corresponding neural network model compression device, a neural network model operation method, an acceleration unit, a calculation system, a
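The group-wise pruning described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the L2-norm ranking criterion, the keep ratio, and the index-plus-values storage layout are all assumptions; only the idea of fixed-length weight groups sized to the accelerator's bit width and pruned as whole units comes from the abstract.

```python
import numpy as np

def prune_by_weight_group(W: np.ndarray, group_len: int, keep_ratio: float):
    """Prune a weight matrix in fixed-length groups along each row.

    group_len would be derived from the accelerator's bit width, e.g. a
    512-bit datapath holding 16 FP32 values gives group_len = 16 (assumed
    here for illustration). Groups with the smallest L2 norms are zeroed
    out entirely, so surviving non-zeros stay in hardware-aligned blocks.
    """
    rows, cols = W.shape
    assert cols % group_len == 0, "pad columns to a multiple of group_len"
    groups = W.reshape(rows, cols // group_len, group_len)
    norms = np.linalg.norm(groups, axis=2)        # one norm per group
    k = max(1, int(norms.size * keep_ratio))      # number of groups to keep
    threshold = np.sort(norms, axis=None)[-k]     # k-th largest group norm
    mask = (norms >= threshold)[..., None]        # broadcast over each group
    sparse = (groups * mask).reshape(rows, cols)
    return sparse, mask.squeeze(-1)

def to_group_index_format(sparse: np.ndarray, mask: np.ndarray, group_len: int):
    """Store only surviving groups as per-row group indices plus packed
    values -- a simple stand-in for the patent's 'predetermined storage
    format', whose actual layout the record does not specify."""
    rows, cols = sparse.shape
    groups = sparse.reshape(rows, cols // group_len, group_len)
    indices = [np.nonzero(mask[r])[0] for r in range(rows)]
    values = [groups[r, idx] for r, idx in enumerate(indices)]
    return indices, values

# Example: a 4x8 matrix split into row-wise groups of 4, keeping the
# half of the groups with the largest norms.
W = np.arange(1.0, 33.0).reshape(4, 8)
sparse, mask = prune_by_weight_group(W, group_len=4, keep_ratio=0.5)
indices, values = to_group_index_format(sparse, mask, group_len=4)
```

Because pruning removes whole groups rather than individual weights, the stored format needs only one index per surviving group, and the accelerator can load each group as a single aligned word.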
Saved in:
Main authors: | YAN CHENGYANG; LI YINGMIN; TU XIAOBIN; LAO MAOYUAN; MAO JUNWEI; ZHANG WEIFENG |
---|---|
Format: | Patent |
Language: | Chinese; English |
Subjects: | CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; PHYSICS |
Online access: | Order full text |
creator | YAN CHENGYANG; LI YINGMIN; TU XIAOBIN; LAO MAOYUAN; MAO JUNWEI; ZHANG WEIFENG |
format | Patent |
fulltext | fulltext_linktorsrc |
language | chi ; eng |
recordid | cdi_epo_espacenet_CN113762493A |
source | esp@cenet |
subjects | CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; PHYSICS |
title | Neural network model compression method and device, acceleration unit and computing system |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T14%3A45%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=YAN%20CHENGYANG&rft.date=2021-12-07&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN113762493A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |