Gradient-Free Neural Network Training on the Edge
Training neural networks is computationally heavy and energy-intensive. Many methodologies were developed to save computational requirements and energy by reducing the precision of network weights at inference time and introducing techniques such as rounding, stochastic rounding, and quantization. However, most of these techniques still require full gradient precision at training time, which makes training such models prohibitive on edge devices. This work presents a novel technique for training neural networks without needing gradients. This enables a training process where all the weights are one or two bits, without any hidden full precision computations. We show that it is possible to train models without gradient-based optimization techniques by identifying erroneous contributions of each neuron towards the expected classification and flipping the relevant bits using logical operations. We tested our method on several standard datasets and achieved performance comparable to corresponding gradient-based baselines with a fraction of the compute power.
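The abstract only describes the idea at a high level. As a rough illustration of what gradient-free, bit-flipping training can look like, the sketch below trains a single linear classifier with 1-bit weights in {-1, +1}: for each weight it counts how often its contribution pushed misclassified samples away from the true label, then flips the most harmful bits. The {-1, +1} encoding, the harm-count heuristic, and the `flip_frac` parameter are assumptions made for this toy example, not the authors' actual algorithm.

```python
import numpy as np

# Toy sketch only (not the paper's method): a single linear layer with 1-bit
# weights, trained without gradients by flipping the bits that most often
# hurt misclassified samples.
rng = np.random.default_rng(0)

def train_bitflip(X, y, epochs=20, flip_frac=0.05):
    """X: (n, d) inputs in {-1, +1}; y: (n,) labels in {-1, +1}."""
    n, d = X.shape
    w = rng.choice([-1, 1], size=d)                # 1-bit weights
    for _ in range(epochs):
        pred = np.sign(X @ w)
        pred[pred == 0] = 1
        wrong = pred != y                          # misclassified samples
        if not wrong.any():
            break
        # For each weight, count how often its contribution x_i * w_i points
        # away from the true label on the misclassified samples ("harm").
        harm = np.sum((X[wrong] * w) * -y[wrong, None] > 0, axis=0)
        # Flip the most harmful bits; in a {-1, +1} encoding a bit flip is a
        # sign change (equivalently, an XOR in a {0, 1} encoding).
        k = max(1, int(flip_frac * d))
        w[np.argsort(harm)[-k:]] *= -1
    return w

# Usage on synthetic data: recover a random sign function over binary inputs.
d = 64
w_true = rng.choice([-1, 1], size=d)
X = rng.choice([-1, 1], size=(2000, d))
y = np.sign(X @ w_true)
y[y == 0] = 1
w = train_bitflip(X, y)
print("train accuracy:", float(np.mean(np.sign(X @ w) == y)))
```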
Saved in:
Published in: | arXiv.org 2024-10 |
---|---|
Main authors: | Dotan Di Castro, Omkar Joglekar, Shir Kozlovsky, Vladimir Tchuiev, Michal Moshkovitz |
Format: | Article |
Language: | eng |
Subjects: | Neural networks; Rounding |
Online access: | Full text |
container_title | arXiv.org |
---|---|
creator | Dotan Di Castro; Joglekar, Omkar; Kozlovsky, Shir; Tchuiev, Vladimir; Moshkovitz, Michal |
description | Training neural networks is computationally heavy and energy-intensive. Many methodologies were developed to save computational requirements and energy by reducing the precision of network weights at inference time and introducing techniques such as rounding, stochastic rounding, and quantization. However, most of these techniques still require full gradient precision at training time, which makes training such models prohibitive on edge devices. This work presents a novel technique for training neural networks without needing gradients. This enables a training process where all the weights are one or two bits, without any hidden full precision computations. We show that it is possible to train models without gradient-based optimization techniques by identifying erroneous contributions of each neuron towards the expected classification and flipping the relevant bits using logical operations. We tested our method on several standard datasets and achieved performance comparable to corresponding gradient-based baselines with a fraction of the compute power. |
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-10 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3116752782 |
source | Free E- Journals |
subjects | Neural networks; Rounding |
title | Gradient-Free Neural Network Training on the Edge |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T14%3A22%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Gradient-Free%20Neural%20Network%20Training%20on%20the%20Edge&rft.jtitle=arXiv.org&rft.au=Dotan%20Di%20Castro&rft.date=2024-10-13&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3116752782%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3116752782&rft_id=info:pmid/&rfr_iscdi=true |