Differentiable Sensor Layouts for End-to-End Learning of Task-Specific Camera Parameters
The success of deep learning is frequently described as the ability to train all parameters of a network on a specific application in an end-to-end fashion. Yet, several design choices on the camera level, including the pixel layout of the sensor, are considered pre-defined and fixed, and high-resolution, regular pixel layouts are considered to be the most generic ones in computer vision and graphics, treating all regions of an image as equally important. While several works have considered non-uniform, e.g., hexagonal or foveated, pixel layouts in hardware and image processing, the layout has not been integrated into the end-to-end learning paradigm so far. In this work, we present the first truly end-to-end trained imaging pipeline that optimizes the size and distribution of pixels on the imaging sensor jointly with the parameters of a given neural network on a specific task. We derive an analytic, differentiable approach for the sensor layout parameterization that allows for task-specific, locally varying pixel resolutions. We present two pixel layout parameterization functions: rectangular and curvilinear grid shapes that retain a regular topology. We provide a drop-in module that approximates sensor simulation given existing high-resolution images to directly connect our method with existing deep learning models. We show that network predictions benefit from learnable pixel layouts for two different downstream tasks, classification and semantic segmentation.
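The drop-in sensor-simulation module described in the abstract — resampling an existing high-resolution image onto a learnable, non-uniform pixel grid — can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `tanh`-based curvilinear warp, the single foveation parameter `alpha`, and the function names are assumptions made for this sketch, and in a real pipeline the gradient with respect to the layout parameters would come from an autodiff framework rather than NumPy.

```python
import numpy as np

def warp_coords(u, alpha):
    # Monotone warp of normalized coordinates u in [-1, 1].
    # alpha > 0 concentrates samples near the center (foveated layout);
    # as alpha -> 0 the warp approaches the identity (uniform grid).
    # The tanh form is an illustrative assumption, not the paper's.
    return np.tanh(alpha * u) / np.tanh(alpha) if alpha != 0 else u

def sample_layout(img, h, w, alpha):
    # Approximate an h x w non-uniform sensor by bilinearly sampling a
    # high-resolution image at warped grid positions. Bilinear
    # interpolation keeps the output differentiable in the sample
    # positions, hence in alpha.
    H, W = img.shape
    ys = (warp_coords(np.linspace(-1, 1, h), alpha) + 1) / 2 * (H - 1)
    xs = (warp_coords(np.linspace(-1, 1, w), alpha) + 1) / 2 * (W - 1)
    y0 = np.clip(np.floor(ys).astype(int), 0, H - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, W - 2)
    wy = (ys - y0)[:, None]           # fractional offsets, shape (h, 1)
    wx = (xs - x0)[None, :]           # fractional offsets, shape (1, w)
    Y0, X0 = np.meshgrid(y0, x0, indexing="ij")
    top = img[Y0, X0] * (1 - wx) + img[Y0, X0 + 1] * wx
    bot = img[Y0 + 1, X0] * (1 - wx) + img[Y0 + 1, X0 + 1] * wx
    return top * (1 - wy) + bot * wy
```

Because the warp is analytic and the interpolation is piecewise differentiable, `alpha` could be optimized jointly with the downstream network's weights, which is the spirit of the end-to-end layout learning the abstract describes.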
Saved in:
Published in: | arXiv.org 2023-04 |
---|---|
Main authors: | Sommerhoff, Hendrik; Agnihotri, Shashank; Saleh, Mohamed; Moeller, Michael; Keuper, Margret; Kolb, Andreas |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
container_title | arXiv.org |
---|---|
creator | Sommerhoff, Hendrik; Agnihotri, Shashank; Saleh, Mohamed; Moeller, Michael; Keuper, Margret; Kolb, Andreas |
description | The success of deep learning is frequently described as the ability to train all parameters of a network on a specific application in an end-to-end fashion. Yet, several design choices on the camera level, including the pixel layout of the sensor, are considered pre-defined and fixed, and high-resolution, regular pixel layouts are considered to be the most generic ones in computer vision and graphics, treating all regions of an image as equally important. While several works have considered non-uniform, e.g., hexagonal or foveated, pixel layouts in hardware and image processing, the layout has not been integrated into the end-to-end learning paradigm so far. In this work, we present the first truly end-to-end trained imaging pipeline that optimizes the size and distribution of pixels on the imaging sensor jointly with the parameters of a given neural network on a specific task. We derive an analytic, differentiable approach for the sensor layout parameterization that allows for task-specific, locally varying pixel resolutions. We present two pixel layout parameterization functions: rectangular and curvilinear grid shapes that retain a regular topology. We provide a drop-in module that approximates sensor simulation given existing high-resolution images to directly connect our method with existing deep learning models. We show that network predictions benefit from learnable pixel layouts for two different downstream tasks, classification and semantic segmentation. |
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2023-04 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2808089869 |
source | Free E-Journals |
subjects | Cameras; Computer vision; Deep learning; High resolution; Image processing; Image resolution; Image segmentation; Layouts; Machine learning; Neural networks; Parameterization; Parameters; Pixels; Semantic segmentation; Sensors; Topology |
title | Differentiable Sensor Layouts for End-to-End Learning of Task-Specific Camera Parameters |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T12%3A06%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Differentiable%20Sensor%20Layouts%20for%20End-to-End%20Learning%20of%20Task-Specific%20Camera%20Parameters&rft.jtitle=arXiv.org&rft.au=Sommerhoff,%20Hendrik&rft.date=2023-04-28&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2808089869%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2808089869&rft_id=info:pmid/&rfr_iscdi=true |