Resolution learning in deep convolutional networks using scale-space theory

Detailed description

Resolution in deep convolutional neural networks (CNNs) is typically bounded by the receptive field size through filter sizes, and by subsampling layers or strided convolutions on feature maps. The optimal resolution may vary significantly depending on the dataset. Modern CNNs hard-code their resolution hyper-parameters in the network architecture, which makes tuning such hyper-parameters cumbersome. We propose to do away with hard-coded resolution hyper-parameters and aim to learn the appropriate resolution from data. We use scale-space theory to obtain a self-similar parametrization of filters and make use of the N-Jet: a truncated Taylor series to approximate a filter by a learned combination of Gaussian derivative filters. The parameter sigma of the Gaussian basis controls both the amount of detail the filter encodes and the spatial extent of the filter. Since sigma is a continuous parameter, we can optimize it with respect to the loss. The proposed N-Jet layer achieves comparable performance when used in state-of-the-art architectures, while learning the correct resolution in each layer automatically. We evaluate our N-Jet layer on both classification and segmentation, and we show that learning sigma is especially beneficial for inputs at multiple sizes.
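
To make the filter parametrization concrete, the following is a minimal PyTorch sketch of such a layer, assuming a second-order jet (six Gaussian derivative basis filters) and a single sigma per layer. The class name NJetConv2d, the +/- 3 sigma kernel extent, and the coefficient initialization are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NJetConv2d(nn.Module):
    """Hypothetical N-Jet convolution: filters are learned combinations of
    Gaussian derivative filters up to 2nd order, with a trainable sigma."""

    def __init__(self, in_channels, out_channels, init_sigma=1.0):
        super().__init__()
        # log-parametrize sigma so it stays positive during optimization
        self.log_sigma = nn.Parameter(torch.tensor(float(init_sigma)).log())
        # coefficients of the 6 basis filters: G, G_x, G_y, G_xx, G_xy, G_yy
        self.coeffs = nn.Parameter(0.1 * torch.randn(out_channels, in_channels, 6))

    def forward(self, x):
        sigma = self.log_sigma.exp()
        # kernel extent follows sigma (+/- 3 sigma here), so the spatial
        # footprint of the filter grows and shrinks with the learned scale
        half = max(int((3.0 * sigma).ceil().item()), 1)
        t = torch.arange(-half, half + 1, dtype=sigma.dtype, device=sigma.device)
        g = torch.exp(-t ** 2 / (2 * sigma ** 2))
        g = g / g.sum()                                    # 0th-order Gaussian
        g1 = (-t / sigma ** 2) * g                         # 1st derivative
        g2 = (t ** 2 / sigma ** 4 - 1.0 / sigma ** 2) * g  # 2nd derivative
        # separable outer products give the 2D Gaussian derivative basis
        basis = torch.stack([
            torch.outer(g, g),    # G
            torch.outer(g, g1),   # G_x (derivative along width)
            torch.outer(g1, g),   # G_y (derivative along height)
            torch.outer(g, g2),   # G_xx
            torch.outer(g1, g1),  # G_xy
            torch.outer(g2, g),   # G_yy
        ])                        # shape (6, k, k) with k = 2 * half + 1
        # learned combination of the basis yields ordinary conv weights
        weight = torch.einsum('oib,bhw->oihw', self.coeffs, basis)
        return F.conv2d(x, weight, padding=half)

Because sigma enters the basis differentiably, it receives gradients from the task loss like any other weight:

layer = NJetConv2d(3, 16, init_sigma=1.5)
out = layer(torch.randn(2, 3, 32, 32))
out.mean().backward()
print(layer.log_sigma.grad)   # non-zero: the resolution parameter is learned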

Bibliographic details

Published in: arXiv.org, 2023-10
Main authors: Pintea, Silvia L; Tomen, Nergis; Goes, Stanley F; Loog, Marco; van Gemert, Jan C
Format: Article
Language: English
Subjects: Artificial neural networks; Computer architecture; Computer Science - Computer Vision and Pattern Recognition; Feature maps; Learning; Optimization; Parameterization; Parameters; Segmentation; Self-similarity; Taylor series
Online access: Full text
DOI: 10.48550/arxiv.2106.03412
EISSN: 2331-8422
Source: arXiv.org; Free E-Journals
URL: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T23%3A03%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Resolution%20learning%20in%20deep%20convolutional%20networks%20using%20scale-space%20theory&rft.jtitle=arXiv.org&rft.au=Pintea,%20Silvia%20L&rft.date=2023-10-24&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2106.03412&rft_dat=%3Cproquest_arxiv%3E2538876673%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2538876673&rft_id=info:pmid/&rfr_iscdi=true