Resolution learning in deep convolutional networks using scale-space theory

Detailed description

Resolution in deep convolutional neural networks (CNNs) is typically bounded by the receptive field size through filter sizes, and by subsampling layers or strided convolutions on feature maps. The optimal resolution may vary significantly depending on the dataset. Modern CNNs hard-code their resolution hyper-parameters in the network architecture, which makes tuning such hyper-parameters cumbersome. We propose to do away with hard-coded resolution hyper-parameters and aim to learn the appropriate resolution from data. We use scale-space theory to obtain a self-similar parametrization of filters and make use of the N-Jet: a truncated Taylor series to approximate a filter by a learned combination of Gaussian derivative filters. The parameter sigma of the Gaussian basis controls both the amount of detail the filter encodes and the spatial extent of the filter. Since sigma is a continuous parameter, we can optimize it with respect to the loss. The proposed N-Jet layer achieves comparable performance when used in state-of-the-art architectures, while learning the correct resolution in each layer automatically. We evaluate our N-Jet layer on both classification and segmentation, and we show that learning sigma is especially beneficial for inputs at multiple sizes.
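
To make the filter parametrization concrete, the following is a minimal PyTorch sketch of such a layer, assuming a second-order jet (six Gaussian derivative basis filters) and a single sigma per layer. The class name NJetConv2d, the +/- 3 sigma kernel extent, and the coefficient initialization are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NJetConv2d(nn.Module):
    """Hypothetical N-Jet convolution: filters are learned combinations of
    Gaussian derivative filters up to 2nd order, with a trainable sigma."""

    def __init__(self, in_channels, out_channels, init_sigma=1.0):
        super().__init__()
        # log-parametrize sigma so it stays positive during optimization
        self.log_sigma = nn.Parameter(torch.tensor(float(init_sigma)).log())
        # coefficients of the 6 basis filters: G, G_x, G_y, G_xx, G_xy, G_yy
        self.coeffs = nn.Parameter(0.1 * torch.randn(out_channels, in_channels, 6))

    def forward(self, x):
        sigma = self.log_sigma.exp()
        # kernel extent follows sigma (+/- 3 sigma here), so the spatial
        # footprint of the filter grows and shrinks with the learned scale
        half = max(int((3.0 * sigma).ceil().item()), 1)
        t = torch.arange(-half, half + 1, dtype=sigma.dtype, device=sigma.device)
        g = torch.exp(-t ** 2 / (2 * sigma ** 2))
        g = g / g.sum()                                    # 0th-order Gaussian
        g1 = (-t / sigma ** 2) * g                         # 1st derivative
        g2 = (t ** 2 / sigma ** 4 - 1.0 / sigma ** 2) * g  # 2nd derivative
        # separable outer products give the 2D Gaussian derivative basis
        basis = torch.stack([
            torch.outer(g, g),    # G
            torch.outer(g, g1),   # G_x (derivative along width)
            torch.outer(g1, g),   # G_y (derivative along height)
            torch.outer(g, g2),   # G_xx
            torch.outer(g1, g1),  # G_xy
            torch.outer(g2, g),   # G_yy
        ])                        # shape (6, k, k) with k = 2 * half + 1
        # learned combination of the basis yields ordinary conv weights
        weight = torch.einsum('oib,bhw->oihw', self.coeffs, basis)
        return F.conv2d(x, weight, padding=half)

Because sigma enters the basis differentiably, it receives gradients from the task loss like any other weight:

layer = NJetConv2d(3, 16, init_sigma=1.5)
out = layer(torch.randn(2, 3, 32, 32))
out.mean().backward()
print(layer.log_sigma.grad)   # non-zero: the resolution parameter is learned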

Bibliographic details

Published in: arXiv.org, 2023-10
Main authors: Pintea, Silvia L; Tomen, Nergis; Goes, Stanley F; Loog, Marco; van Gemert, Jan C
Format: Article
Language: English
Subjects: Artificial neural networks; Computer architecture; Computer Science - Computer Vision and Pattern Recognition; Feature maps; Learning; Optimization; Parameterization; Parameters; Segmentation; Self-similarity; Taylor series
Online access: Full text
DOI: 10.48550/arxiv.2106.03412
EISSN: 2331-8422
Source: arXiv.org; Free E-Journals
URL: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T23%3A03%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Resolution%20learning%20in%20deep%20convolutional%20networks%20using%20scale-space%20theory&rft.jtitle=arXiv.org&rft.au=Pintea,%20Silvia%20L&rft.date=2023-10-24&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2106.03412&rft_dat=%3Cproquest_arxiv%3E2538876673%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2538876673&rft_id=info:pmid/&rfr_iscdi=true