REUSING WEIGHTS AND BIASES IN AN ARTIFICIAL INTELLIGENCE ACCELERATOR FOR A NEURAL NETWORK FOR DIFFERENT MINIBATCH SIZES OF INFERENCES
Provided are a computer program product, system, and method for reusing weights and biases in an artificial intelligence accelerator for a neural network for different minibatch sizes of inferences. A minibatch size is selected of inference jobs batched to process in the accelerator. A representation of a neural network is processed to determine a set of weights and biases for the selected minibatch size to load into the core. The set of weights and biases is loaded into the core for use by the array of processing elements in the core of the accelerator. The weights and the biases are reused in the processing elements for the neural network, loaded for the selected minibatch size, to apply to minibatches of inferences having minibatch sizes less than the selected minibatch size.
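The reuse scheme the abstract describes can be sketched roughly as follows. This is a hypothetical illustration, not the patent's implementation: the class, method, and parameter names (`AcceleratorCoreSketch`, `selected_minibatch`, `infer`) are invented for the example, and a dense matrix multiply stands in for the accelerator's processing-element array.

```python
import numpy as np


class AcceleratorCoreSketch:
    """Toy model of the idea: weights and biases are loaded into the
    core once, sized for a selected maximum minibatch size, and then
    reused for any inference minibatch no larger than that size."""

    def __init__(self, weights: np.ndarray, biases: np.ndarray,
                 selected_minibatch: int):
        # One-time load of weights and biases for the selected size.
        self.weights = weights
        self.biases = biases
        self.selected_minibatch = selected_minibatch
        self.load_count = 1  # no further loads happen per inference

    def infer(self, minibatch: np.ndarray) -> np.ndarray:
        # Smaller minibatches reuse the already-loaded parameters
        # instead of triggering a reload for each batch size.
        if minibatch.shape[0] > self.selected_minibatch:
            raise ValueError("minibatch exceeds the size the weights "
                             "were loaded for")
        return minibatch @ self.weights + self.biases


rng = np.random.default_rng(0)
core = AcceleratorCoreSketch(rng.standard_normal((4, 2)),
                             rng.standard_normal(2),
                             selected_minibatch=8)
# Batches of 3 and 5 both run against the same loaded parameters.
out_a = core.infer(rng.standard_normal((3, 4)))
out_b = core.infer(rng.standard_normal((5, 4)))
```

The point of the sketch is only the control flow: load once for the largest expected minibatch, then serve every smaller batch without touching the weight memory again.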
Format: | Patent
---|---|
Language: | eng
Online access: | Order full text
---|---|
creator | Venkataramani, Swagath; Schaal, Marcel; Srinivasan, Vijayalakshmi; Nagarajan, Amrit; Sen, Sanchari; Ramji, Shyam
description | Provided are a computer program product, system, and method for reusing weights and biases in an artificial intelligence accelerator for a neural network for different minibatch sizes of inferences. A minibatch size is selected of inference jobs batched to process in the accelerator. A representation of a neural network is processed to determine a set of weights and biases for the selected minibatch size to load into the core. The set of weights and biases is loaded into the core for use by the array of processing elements in the core of the accelerator. The weights and the biases are reused in the processing elements for the neural network, loaded for the selected minibatch size, to apply to minibatches of inferences having minibatch sizes less than the selected minibatch size. |
format | Patent |
fulltext | fulltext_linktorsrc |
language | eng |
recordid | cdi_epo_espacenet_US2025005326A1 |
source | esp@cenet |
subjects | CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; PHYSICS
title | REUSING WEIGHTS AND BIASES IN AN ARTIFICIAL INTELLIGENCE ACCELERATOR FOR A NEURAL NETWORK FOR DIFFERENT MINIBATCH SIZES OF INFERENCES |
url | https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20250102&DB=EPODOC&CC=US&NR=2025005326A1