MIXED-PRECISION QUANTIZATION IN MACHINE LEARNING USING MODEL SENSITIVITY AND CONSTRAINED OPTIMIZATION

Certain aspects of the present disclosure provide techniques for performing mixed precision quantization of a machine learning model. In one example, a method includes determining a sensitivity value for each of one or more quantizers, wherein each quantizer is associated with one or more non-overla...
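
As a rough illustration of the allocation step summarized above, the following minimal sketch picks one bitwidth per quantizer so that the summed sensitivity is smallest while a single constraint holds. It is not the solver described in the patent: the exhaustive search, the function name allocate_bitwidths, the storage budget used as the constraint, and all numeric values are assumptions made for this example.

```python
from itertools import product


def allocate_bitwidths(sensitivity, sizes, candidate_bits, budget_bits):
    """Pick one bitwidth per quantizer by exhaustive search (toy scale only).

    sensitivity[i][b] -- assumed sensitivity of quantizer i at bitwidth b
    sizes[i]          -- number of model elements covered by quantizer i
                         (quantizers cover non-overlapping elements)
    budget_bits       -- assumed constraint: total storage in bits
    """
    best_total, best_alloc = float("inf"), None
    for alloc in product(candidate_bits, repeat=len(sizes)):
        # Constraint: total storage must not exceed the budget.
        cost = sum(n * b for n, b in zip(sizes, alloc))
        if cost > budget_bits:
            continue
        # Objective: total sensitivity over all quantizers.
        total = sum(sensitivity[i][b] for i, b in enumerate(alloc))
        if total < best_total:
            best_total, best_alloc = total, alloc
    return best_alloc, best_total


# Hypothetical sensitivities for three quantizers at 4 and 8 bits.
sensitivity = [
    {4: 0.90, 8: 0.10},
    {4: 0.20, 8: 0.05},
    {4: 0.50, 8: 0.02},
]
sizes = [100, 400, 100]

alloc, total = allocate_bitwidths(sensitivity, sizes, (4, 8), budget_bits=3200)
print(alloc, round(total, 2))
```

With these made-up numbers the large but insensitive second quantizer is kept at 4 bits, which frees enough budget to give the other two 8 bits. Exhaustive search grows exponentially with the number of quantizers, so a realistic model would need an integer-programming or dynamic-programming solver instead.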

Detailed Description

Bibliographic Details
Main authors:	FOURNARAKIS, Marios ; NAGEL, Markus ; PETERS, Jorn Wilhelmus Timotheus ; VAN BAALEN, Marinus Willem
Format:	Patent
Language:	eng ; fre
Subjects:	CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; PHYSICS

creator	FOURNARAKIS, Marios ; NAGEL, Markus ; PETERS, Jorn Wilhelmus Timotheus ; VAN BAALEN, Marinus Willem
description	Certain aspects of the present disclosure provide techniques for performing mixed precision quantization of a machine learning model. In one example, a method includes determining a sensitivity value for each of one or more quantizers, wherein each quantizer is associated with one or more non-overlapping elements of a machine learning model architecture; and determining a bitwidth allocation for each of the one or more quantizers by solving an optimization problem defined by at least: an optimization objective of minimizing total sensitivity for the machine learning model architecture based on the bitwidth allocation; and one or more constraints. Certains aspects de la présente divulgation concernent des techniques pour effectuer une quantification de précision mixte d'un modèle d'apprentissage automatique. Dans un exemple, un procédé consiste à déterminer une valeur de sensibilité pour un quantificateur ou pour chacun de plusieurs quantificateurs, chaque quantificateur étant associé à un ou plusieurs éléments ne se chevauchant pas d'une architecture de modèle d'apprentissage automatique ; et à déterminer une attribution de largeur de bits pour le quantificateur ou pour chacun des quantificateurs en résolvant un problème d'optimisation défini par au moins : un objectif d'optimisation consistant à réduire à un minimum la sensibilité totale pour l'architecture de modèle d'apprentissage automatique sur la base de l'attribution de largeur de bits ; et une ou plusieurs contraintes.
format	Patent
fullrecord	<record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_WO2024186332A1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>WO2024186332A1</sourcerecordid><originalsourceid>FETCH-epo_espacenet_WO2024186332A13</originalsourceid><addsrcrecordid>eNqNjM0KgkAURt20iOodLrQWUiPcXmamvODcMefa30YkplWUYO9PCT5Am-9w4PDNo2DpYnRc1UaRJ8dwbJCFbiijEINFVRAbKA3WTHyAxo9rnTYleMOehE4kV0DWoBx7qfHXa3CVkJ2OltHs0T2HsJq4iNZ7I6qIQ_9uw9B39_AKn_bs0k26TfJdlqWYZP9VX7ARNmM</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>MIXED-PRECISION QUANTIZATION IN MACHINE LEARNING USING MODEL SENSITIVITY AND CONSTRAINED OPTIMIZATION</title><source>esp@cenet</source><creator>FOURNARAKIS, Marios ; NAGEL, Markus ; PETERS, Jorn Wilhelmus Timotheus ; VAN BAALEN, Marinus Willem</creator><creatorcontrib>FOURNARAKIS, Marios ; NAGEL, Markus ; PETERS, Jorn Wilhelmus Timotheus ; VAN BAALEN, Marinus Willem</creatorcontrib><description>Certain aspects of the present disclosure provide techniques for performing mixed precision quantization of a machine learning model. In one example, a method includes determining a sensitivity value for each of one or more quantizers, wherein each quantizer is associated with one or more non-overlapping elements of a machine learning model architecture; and determining a bitwidth allocation for each of the one or more quantizers by solving an optimization problem defined by at least: an optimization objective of minimizing total sensitivity for the machine learning model architecture based on the bitwidth allocation; and one or more constraints. Certains aspects de la présente divulgation concernent des techniques pour effectuer une quantification de précision mixte d'un modèle d'apprentissage automatique. 
Dans un exemple, un procédé consiste à déterminer une valeur de sensibilité pour un quantificateur ou pour chacun de plusieurs quantificateurs, chaque quantificateur étant associé à un ou plusieurs éléments ne se chevauchant pas d'une architecture de modèle d'apprentissage automatique ; et à déterminer une attribution de largeur de bits pour le quantificateur ou pour chacun des quantificateurs en résolvant un problème d'optimisation défini par au moins : un objectif d'optimisation consistant à réduire à un minimum la sensibilité totale pour l'architecture de modèle d'apprentissage automatique sur la base de l'attribution de largeur de bits ; et une ou plusieurs contraintes.</description><language>eng ; fre</language><subject>CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; PHYSICS</subject><creationdate>2024</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240912&DB=EPODOC&CC=WO&NR=2024186332A1$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,776,881,25542,76290</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240912&DB=EPODOC&CC=WO&NR=2024186332A1$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>FOURNARAKIS, Marios</creatorcontrib><creatorcontrib>NAGEL, Markus</creatorcontrib><creatorcontrib>PETERS, Jorn Wilhelmus Timotheus</creatorcontrib><creatorcontrib>VAN BAALEN, Marinus Willem</creatorcontrib><title>MIXED-PRECISION QUANTIZATION IN MACHINE LEARNING USING MODEL SENSITIVITY AND CONSTRAINED OPTIMIZATION</title><description>Certain aspects of the present disclosure 
provide techniques for performing mixed precision quantization of a machine learning model. In one example, a method includes determining a sensitivity value for each of one or more quantizers, wherein each quantizer is associated with one or more non-overlapping elements of a machine learning model architecture; and determining a bitwidth allocation for each of the one or more quantizers by solving an optimization problem defined by at least: an optimization objective of minimizing total sensitivity for the machine learning model architecture based on the bitwidth allocation; and one or more constraints. Certains aspects de la présente divulgation concernent des techniques pour effectuer une quantification de précision mixte d'un modèle d'apprentissage automatique. Dans un exemple, un procédé consiste à déterminer une valeur de sensibilité pour un quantificateur ou pour chacun de plusieurs quantificateurs, chaque quantificateur étant associé à un ou plusieurs éléments ne se chevauchant pas d'une architecture de modèle d'apprentissage automatique ; et à déterminer une attribution de largeur de bits pour le quantificateur ou pour chacun des quantificateurs en résolvant un problème d'optimisation défini par au moins : un objectif d'optimisation consistant à réduire à un minimum la sensibilité totale pour l'architecture de modèle d'apprentissage automatique sur la base de l'attribution de largeur de bits ; et une ou plusieurs contraintes.</description><subject>CALCULATING</subject><subject>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL 
MODELS</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2024</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNqNjM0KgkAURt20iOodLrQWUiPcXmamvODcMefa30YkplWUYO9PCT5Am-9w4PDNo2DpYnRc1UaRJ8dwbJCFbiijEINFVRAbKA3WTHyAxo9rnTYleMOehE4kV0DWoBx7qfHXa3CVkJ2OltHs0T2HsJq4iNZ7I6qIQ_9uw9B39_AKn_bs0k26TfJdlqWYZP9VX7ARNmM</recordid><startdate>20240912</startdate><enddate>20240912</enddate><creator>FOURNARAKIS, Marios</creator><creator>NAGEL, Markus</creator><creator>PETERS, Jorn Wilhelmus Timotheus</creator><creator>VAN BAALEN, Marinus Willem</creator><scope>EVB</scope></search><sort><creationdate>20240912</creationdate><title>MIXED-PRECISION QUANTIZATION IN MACHINE LEARNING USING MODEL SENSITIVITY AND CONSTRAINED OPTIMIZATION</title><author>FOURNARAKIS, Marios ; NAGEL, Markus ; PETERS, Jorn Wilhelmus Timotheus ; VAN BAALEN, Marinus Willem</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_WO2024186332A13</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng ; fre</language><creationdate>2024</creationdate><topic>CALCULATING</topic><topic>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>FOURNARAKIS, Marios</creatorcontrib><creatorcontrib>NAGEL, Markus</creatorcontrib><creatorcontrib>PETERS, Jorn Wilhelmus Timotheus</creatorcontrib><creatorcontrib>VAN BAALEN, Marinus Willem</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>FOURNARAKIS, Marios</au><au>NAGEL, Markus</au><au>PETERS, Jorn Wilhelmus Timotheus</au><au>VAN BAALEN, Marinus 
Willem</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>MIXED-PRECISION QUANTIZATION IN MACHINE LEARNING USING MODEL SENSITIVITY AND CONSTRAINED OPTIMIZATION</title><date>2024-09-12</date><risdate>2024</risdate><abstract>Certain aspects of the present disclosure provide techniques for performing mixed precision quantization of a machine learning model. In one example, a method includes determining a sensitivity value for each of one or more quantizers, wherein each quantizer is associated with one or more non-overlapping elements of a machine learning model architecture; and determining a bitwidth allocation for each of the one or more quantizers by solving an optimization problem defined by at least: an optimization objective of minimizing total sensitivity for the machine learning model architecture based on the bitwidth allocation; and one or more constraints. Certains aspects de la présente divulgation concernent des techniques pour effectuer une quantification de précision mixte d'un modèle d'apprentissage automatique. Dans un exemple, un procédé consiste à déterminer une valeur de sensibilité pour un quantificateur ou pour chacun de plusieurs quantificateurs, chaque quantificateur étant associé à un ou plusieurs éléments ne se chevauchant pas d'une architecture de modèle d'apprentissage automatique ; et à déterminer une attribution de largeur de bits pour le quantificateur ou pour chacun des quantificateurs en résolvant un problème d'optimisation défini par au moins : un objectif d'optimisation consistant à réduire à un minimum la sensibilité totale pour l'architecture de modèle d'apprentissage automatique sur la base de l'attribution de largeur de bits ; et une ou plusieurs contraintes.</abstract><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
language	eng ; fre
recordid	cdi_epo_espacenet_WO2024186332A1
source	esp@cenet
subjects	CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; PHYSICS
title	MIXED-PRECISION QUANTIZATION IN MACHINE LEARNING USING MODEL SENSITIVITY AND CONSTRAINED OPTIMIZATION
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T14%3A20%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=FOURNARAKIS,%20Marios&rft.date=2024-09-12&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EWO2024186332A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true