Learning Activation Functions in Deep (Spline) Neural Networks

We develop an efficient computational solution to train deep neural networks (DNNs) with free-form activation functions. To make the problem well-posed, we augment the cost functional of the DNN by adding an appropriate shape regularization: the sum of the second-order total variations of the trainable nonlinearities. The representer theorem for DNNs tells us that the optimal activation functions are adaptive piecewise-linear splines, which allows us to recast the problem as a parametric optimization. The challenging point is that the corresponding basis functions (ReLUs) are poorly conditioned and that the determination of their number and positioning is also part of the problem. We circumvent the difficulty by using an equivalent B-spline basis to encode the activation functions and by expressing the regularization as an ℓ1-penalty. This results in the specification of parametric activation-function modules that can be implemented and optimized efficiently on standard development platforms. We present experimental results that demonstrate the benefit of our approach.

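The abstract describes the key computational trick: each free-form activation is encoded as a piecewise-linear spline in a B-spline basis on a uniform knot grid, so that its second-order total variation reduces to an ℓ1-penalty on the second finite differences of the B-spline coefficients. Concretely, for σ(x) = Σ_k c_k β¹(x/T − k) with knot spacing T, TV⁽²⁾(σ) = (1/T) Σ_k |c_{k+1} − 2c_k + c_{k−1}|. The following is a minimal sketch of such a learnable activation module, not the authors' reference implementation; the framework (PyTorch), the class and parameter names (SplineActivation, size, grid_range), the ReLU initialization, and the clamping of inputs outside the knot range are illustrative assumptions.

import torch
import torch.nn as nn


class SplineActivation(nn.Module):
    """Learnable piecewise-linear (B-spline) activation, one spline per neuron.

    Hypothetical sketch: sigma(x) = sum_k c_k * B1(x/T - k) on a uniform grid,
    with a TV(2) regularizer equal to the l1-norm of the second finite
    differences of the coefficients, scaled by the knot spacing T.
    """

    def __init__(self, num_neurons: int, size: int = 21, grid_range: float = 3.0):
        super().__init__()
        self.size = size                                   # coefficients per neuron
        self.T = 2.0 * grid_range / (size - 1)             # uniform knot spacing
        knots = torch.linspace(-grid_range, grid_range, size)
        self.register_buffer("knots", knots)
        # Initialize every activation to ReLU: c_k = max(knot_k, 0).
        self.coeffs = nn.Parameter(torch.clamp(knots, min=0.0).repeat(num_neurons, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_neurons). Evaluating a linear B-spline expansion is
        # piecewise-linear interpolation of the coefficients; inputs outside
        # the knot range are clamped here for simplicity.
        t = ((x - self.knots[0]) / self.T).clamp(0.0, self.size - 1 - 1e-6)
        left = t.floor().long()                            # lower knot index
        frac = t - left.to(t.dtype)                        # position inside the cell
        c = self.coeffs.unsqueeze(0).expand(x.shape[0], -1, -1)  # (batch, neurons, size)
        c0 = torch.gather(c, 2, left.unsqueeze(-1)).squeeze(-1)
        c1 = torch.gather(c, 2, (left + 1).clamp(max=self.size - 1).unsqueeze(-1)).squeeze(-1)
        return c0 + frac * (c1 - c0)

    def tv2(self) -> torch.Tensor:
        # Second-order total variation of each spline, summed over neurons:
        # (1/T) * sum_k |c_{k+1} - 2 c_k + c_{k-1}|.
        d2 = self.coeffs[:, 2:] - 2.0 * self.coeffs[:, 1:-1] + self.coeffs[:, :-2]
        return d2.abs().sum() / self.T


# Usage: add the TV(2) penalty of every spline module to the task loss.
act = SplineActivation(num_neurons=8)
y = act(torch.randn(4, 8))
reg = 1e-3 * act.tv2()

Training would then minimize the task loss plus λ times the sum of tv2() over all such modules. Because the B-spline basis has local support it is well conditioned, and the ℓ1 form of the penalty favors sparse second differences, i.e. activations with few active knots.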

Bibliographic Details
Published in: IEEE Open Journal of Signal Processing, 2020, Vol. 1, pp. 295-309
Main Authors: Bohra, Pakshal; Campos, Joaquim; Gupta, Harshit; Aziznejad, Shayan; Unser, Michael
Format: Article
Language: English
Subjects: Activation functions; Artificial neural networks; B-spline functions; B-splines; Basis functions; Computational efficiency; Deep learning; Free form; Functionals; Machine learning; Neural networks; Neurons; Optimization; Regularization; Sparsity; Splines (mathematics); Training
Online Access: Full text
DOI: 10.1109/OJSP.2020.3039379
ISSN: 2644-1322
Source: IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals