Single-image SVBRDF capture with a rendering-aware deep network

Texture, highlights, and shading are some of the many visual cues that allow humans to perceive material appearance in single pictures. Yet, recovering spatially-varying bi-directional reflectance distribution functions (SVBRDFs) from a single image based on such cues has challenged researchers in computer graphics for decades. We tackle lightweight appearance capture by training a deep neural network to automatically extract and make sense of these visual cues. Once trained, our network is capable of recovering per-pixel normal, diffuse albedo, specular albedo and specular roughness from a single picture of a flat surface lit by a hand-held flash. We achieve this goal by introducing several innovations in training data acquisition and network design. For training, we leverage a large dataset of artist-created, procedural SVBRDFs which we sample and render under multiple lighting directions. We further amplify the data by material mixing to cover a wide diversity of shading effects, which allows our network to work across many material classes. Motivated by the observation that distant regions of a material sample often offer complementary visual cues, we design a network that combines an encoder-decoder convolutional track for local feature extraction with a fully-connected track for global feature extraction and propagation. Many important material effects are view-dependent, and as such ambiguous when observed in a single image. We tackle this challenge by defining the loss as a differentiable SVBRDF similarity metric that compares renderings of the predicted maps against renderings of the ground truth from several lighting and viewing directions. Together, these novel ingredients bring a clear improvement over state-of-the-art methods for single-shot capture of spatially-varying BRDFs.
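The data-amplification step described above blends pairs of artist-created materials into new training samples. Below is a minimal sketch of that idea; the per-map convex combination, the random scalar weight, and the `mix_materials` helper are illustrative assumptions, not the paper's actual mixing procedure.

```python
# Hypothetical sketch of material mixing for data amplification: blend two
# SVBRDFs, each a tuple of (normal, diffuse, specular, roughness) tensors,
# into a new training material. The plain convex per-map blend is an
# assumption for illustration only.
import torch

def mix_materials(maps_a, maps_b):
    alpha = torch.rand(())  # random mixing weight in [0, 1)
    # Note: linearly blended normals are no longer unit length and would
    # need re-normalization before rendering.
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(maps_a, maps_b))
```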
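The two-track design, a convolutional encoder-decoder for local features plus a fully-connected track that pools, transforms, and re-broadcasts global features, can be sketched as follows. This is a minimal PyTorch illustration under assumed sizes: the layer counts, channel widths, activations, and the 3+3+3+1 split into normal, diffuse albedo, specular albedo, and roughness outputs are not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GlobalTrack(nn.Module):
    """Fully-connected track: pools features to one global vector per image,
    transforms it, and re-injects it at every pixel of the local track."""
    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(channels, channels), nn.SELU())

    def forward(self, x):
        g = self.fc(x.mean(dim=(2, 3)))  # (B, C): global average pool + FC
        return x + g[:, :, None, None]   # broadcast back over all pixels

class TwoTrackNet(nn.Module):
    def __init__(self, feat=32):
        super().__init__()
        # Local track: strided-conv encoder, transposed-conv decoder.
        self.enc1 = nn.Sequential(nn.Conv2d(3, feat, 4, 2, 1), nn.LeakyReLU(0.2))
        self.enc2 = nn.Sequential(nn.Conv2d(feat, 2 * feat, 4, 2, 1), nn.LeakyReLU(0.2))
        self.glob1 = GlobalTrack(feat)
        self.glob2 = GlobalTrack(2 * feat)
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(2 * feat, feat, 4, 2, 1), nn.ReLU())
        self.dec1 = nn.ConvTranspose2d(feat, 10, 4, 2, 1)  # 3+3+3+1 output maps

    def forward(self, img):
        x = self.glob1(self.enc1(img))
        x = self.glob2(self.enc2(x))
        maps = torch.sigmoid(self.dec1(self.dec2(x)))
        normal, diffuse, specular, rough = maps.split([3, 3, 3, 1], dim=1)
        return normal * 2 - 1, diffuse, specular, rough  # normals in [-1, 1]
```

The key move is in `GlobalTrack.forward`: average-pooling collapses the feature map to one vector per image, so cues from distant regions of the material sample can influence every pixel after the broadcast addition.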
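Finally, the rendering-aware loss compares images rather than maps. The toy version below assumes a Blinn-Phong stand-in for the paper's shading model, a fixed head-on view, and directional lights sampled at random; only the idea of comparing renderings of predicted and ground-truth maps comes from the abstract.

```python
# Toy differentiable rendering loss in the spirit of the abstract; the
# shading model and light sampling are simplified assumptions.
import torch
import torch.nn.functional as F

def render(normal, diffuse, specular, roughness, light):
    """Blinn-Phong shading of SVBRDF maps under one directional light,
    viewed head-on; stands in for the paper's differentiable renderer."""
    n = F.normalize(normal, dim=1)
    l = F.normalize(light, dim=1)
    view = torch.tensor([0.0, 0.0, 1.0]).view(1, 3, 1, 1)
    h = F.normalize(l + view, dim=1)
    ndl = (n * l).sum(1, keepdim=True).clamp(min=0.0)
    ndh = (n * h).sum(1, keepdim=True).clamp(min=1e-6)  # avoid 0**p gradients
    shininess = 2.0 / roughness.clamp(min=1e-3) ** 2
    return diffuse * ndl + specular * ndh ** shininess * ndl

def rendering_loss(pred_maps, gt_maps, n_lights=9):
    """L1 distance between renderings of predicted and ground-truth maps
    under the same randomly sampled lights."""
    loss = 0.0
    for _ in range(n_lights):
        light = torch.randn(1, 3, 1, 1)
        light[:, 2] = light[:, 2].abs() + 0.1  # keep the light above the surface
        loss = loss + (render(*pred_maps, light) - render(*gt_maps, light)).abs().mean()
    return loss / n_lights
```

Because the comparison happens in image space, predicted maps are penalized only where they change appearance under the sampled lights, which directly targets the view-dependent ambiguity the abstract mentions.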

Bibliographic details
Published in: ACM transactions on graphics, 2018, Vol. 37 (4), p. 1-15
Main authors: Deschaintre, Valentin; Aittala, Miika; Durand, Fredo; Drettakis, George; Bousseau, Adrien
Format: Article
Language: English
Subjects: Computer Science; Image Processing; Neural and Evolutionary Computing
Online access: Full text
DOI: 10.1145/3197517.3201378
ISSN: 0730-0301
EISSN: 1557-7368
Publisher: Association for Computing Machinery
Source: ACM Digital Library Complete