REUSING WEIGHTS AND BIASES IN AN ARTIFICIAL INTELLIGENCE ACCELERATOR FOR A NEURAL NETWORK FOR DIFFERENT MINIBATCH SIZES OF INFERENCES

Provided are a computer program product, system, and method for reusing weights and biases in an artificial intelligence accelerator for a neural network for different minibatch sizes of inferences. A minibatch size is selected of inference jobs batched to process in the accelerator. A representation of a neural network is processed to determine a set of weights and biases for the selected minibatch size to load into the core. The set of weights and biases is loaded into the core for use by the array of processing elements in the core of the accelerator. The weights and the biases are reused in the processing elements for the neural network, loaded for the selected minibatch size, to apply to minibatches of inferences having minibatch sizes less than the selected minibatch size.
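The abstract outlines a parameter-reuse strategy: pick a maximum (selected) minibatch size, stage the corresponding weights and biases in the accelerator core once, and then serve any smaller minibatch without reloading them. The Python sketch below is only a loose illustration of that idea, not the patented implementation; names such as AcceleratorCore, MinibatchScheduler, and load_parameters are hypothetical stand-ins.

from dataclasses import dataclass, field

@dataclass
class AcceleratorCore:
    """Stand-in for an AI accelerator core with on-chip memory for weights and biases."""
    loaded_params: dict = field(default_factory=dict)
    loads: int = 0  # how many times parameters were (re)loaded into the core

    def load_parameters(self, params: dict) -> None:
        self.loaded_params = dict(params)
        self.loads += 1

    def run(self, minibatch: list) -> list:
        # Placeholder compute: a real core would apply the loaded weights and biases
        # across its array of processing elements.
        return [f"output({x})" for x in minibatch]

class MinibatchScheduler:
    """Loads parameters once for a selected minibatch size and reuses them for smaller batches."""

    def __init__(self, core: AcceleratorCore, params: dict, selected_size: int):
        self.core = core
        self.selected_size = selected_size
        self.core.load_parameters(params)  # single load, sized for the selected minibatch size

    def infer(self, minibatch: list) -> list:
        if len(minibatch) > self.selected_size:
            raise ValueError("minibatch larger than the selected size; parameters would need reloading")
        # Any minibatch no larger than the selected size reuses the resident parameters.
        return self.core.run(minibatch)

if __name__ == "__main__":
    core = AcceleratorCore()
    scheduler = MinibatchScheduler(core, params={"W1": [0.5], "b1": [0.1]}, selected_size=8)
    print(scheduler.infer(["img0", "img1", "img2"]))  # size 3 <= 8: no reload
    print(scheduler.infer(["img3"]))                  # size 1 <= 8: still no reload
    print("parameter loads:", core.loads)             # prints 1

The point of the sketch is the counter: however many smaller minibatches arrive, core.loads stays at 1, which is the reuse property the abstract describes.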


Bibliographic details
Main authors: Venkataramani, Swagath; Schaal, Marcel; Srinivasan, Vijayalakshmi; Nagarajan, Amrit; Sen, Sanchari; Ramji, Shyam
Format: Patent
Language: eng
Subjects:
Online access: Order full text
creator Venkataramani, Swagath
Schaal, Marcel
Srinivasan, Vijayalakshmi
Nagarajan, Amrit
Sen, Sanchari
Ramji, Shyam
description Provided are a computer program product, system, and method for reusing weights and biases in an artificial intelligence accelerator for a neural network for different minibatch sizes of inferences. A minibatch size is selected of inference jobs batched to process in the accelerator. A representation of a neural network is processed to determine a set of weights and biases for the selected minibatch size to load into the core. The set of weights and biases is loaded into the core for use by the array of processing elements in the core of the accelerator. The weights and the biases are reused in the processing elements for the neural network, loaded for the selected minibatch size, to apply to minibatches of inferences having minibatch sizes less than the selected minibatch size.
format Patent
creationdate 2025-01-02
fulltext link https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20250102&DB=EPODOC&CC=US&NR=2025005326A1
fulltext fulltext_linktorsrc
language eng
recordid cdi_epo_espacenet_US2025005326A1
source esp@cenet
subjects CALCULATING
COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
COMPUTING
COUNTING
PHYSICS
title REUSING WEIGHTS AND BIASES IN AN ARTIFICIAL INTELLIGENCE ACCELERATOR FOR A NEURAL NETWORK FOR DIFFERENT MINIBATCH SIZES OF INFERENCES
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T14%3A10%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Venkataramani,%20Swagath&rft.date=2025-01-02&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS2025005326A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true