Neural Head Avatars from Monocular RGB Videos

We present Neural Head Avatars, a novel neural representation that explicitly models the surface geometry and appearance of an animatable human avatar that can be used for teleconferencing in AR/VR or other applications in the movie or games industry that rely on a digital human. Our representation...

Bibliographic Details
Main authors: Grassal, Philip-William, Prinzler, Malte, Leistner, Titus, Rother, Carsten, Nießner, Matthias, Thies, Justus
Format: Article
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Grassal, Philip-William
Prinzler, Malte
Leistner, Titus
Rother, Carsten
Nießner, Matthias
Thies, Justus
description We present Neural Head Avatars, a novel neural representation that explicitly models the surface geometry and appearance of an animatable human avatar that can be used for teleconferencing in AR/VR or other applications in the movie or games industry that rely on a digital human. Our representation can be learned from a monocular RGB portrait video that features a range of different expressions and views. Specifically, we propose a hybrid representation consisting of a morphable model for the coarse shape and expressions of the face, and two feed-forward networks, predicting vertex offsets of the underlying mesh as well as a view- and expression-dependent texture. We demonstrate that this representation is able to accurately extrapolate to unseen poses and viewpoints, and generates natural expressions while providing sharp texture details. Compared to previous works on head avatars, our method provides a disentangled shape and appearance model of the complete human head (including hair) that is compatible with the standard graphics pipeline. Moreover, it quantitatively and qualitatively outperforms the current state of the art in terms of reconstruction quality and novel-view synthesis.
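The hybrid representation described in the abstract (a coarse morphable model plus two feed-forward networks for vertex offsets and for view- and expression-dependent appearance) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the linear morphable model, the network sizes, the network inputs, the random untrained weights, and the use of per-vertex colors in place of a texture map are all assumptions made for illustration, and names such as `avatar`, `geometry_net`, and `texture_net` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to keep the sketch small.
N_VERTS, N_EXPR, HIDDEN = 100, 10, 32


def init_mlp(n_in, n_out):
    """Random (untrained) weights for a one-hidden-layer network."""
    return [
        (rng.normal(0.0, 0.1, (n_in, HIDDEN)), np.zeros(HIDDEN)),
        (rng.normal(0.0, 0.1, (HIDDEN, n_out)), np.zeros(n_out)),
    ]


def mlp(params, x):
    """Feed-forward pass: linear, ReLU, linear."""
    (w1, b1), (w2, b2) = params
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2


# Assumed coarse model: mean shape plus a linear expression basis.
mean_shape = rng.normal(size=(N_VERTS, 3))
expr_basis = rng.normal(0.0, 0.01, (N_EXPR, N_VERTS, 3))

# Two networks, as in the abstract; their inputs are an assumption.
geometry_net = init_mlp(3 + N_EXPR, 3)      # -> per-vertex offset
texture_net = init_mlp(3 + N_EXPR + 3, 3)   # -> per-vertex RGB


def avatar(expr, view_dir):
    """Return refined vertices and view/expression-dependent colors."""
    # Coarse shape and expression from the morphable model.
    coarse = mean_shape + np.tensordot(expr, expr_basis, axes=1)
    expr_tiled = np.broadcast_to(expr, (N_VERTS, N_EXPR))

    # Network 1: vertex offsets added on top of the coarse mesh.
    geo_in = np.concatenate([coarse, expr_tiled], axis=1)
    vertices = coarse + mlp(geometry_net, geo_in)

    # Network 2: appearance conditioned on expression and view direction.
    view_tiled = np.broadcast_to(view_dir, (N_VERTS, 3))
    tex_in = np.concatenate([coarse, expr_tiled, view_tiled], axis=1)
    colors = 1.0 / (1.0 + np.exp(-mlp(texture_net, tex_in)))  # sigmoid -> (0, 1)
    return vertices, colors


vertices, colors = avatar(np.zeros(N_EXPR), np.array([0.0, 0.0, 1.0]))
print(vertices.shape, colors.shape)
```

In this sketch the geometry depends only on the expression code while the colors also depend on the view direction, which mirrors the disentangled shape and appearance model the abstract describes. The refined vertices and colors form an ordinary mesh, so they could be passed to a standard rasterizer.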
doi_str_mv 10.48550/arxiv.2112.01554
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2112_01554</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2112_01554</sourcerecordid><originalsourceid>FETCH-LOGICAL-a674-ae7c528c00a5765c54e03dc7e895e1b64690d7b1f75c57bf7265ea5e1794f1e93</originalsourceid><addsrcrecordid>eNotzs2KwjAUhuFsXIh6Aa7MDbQmbU5Ou3TEP_AHRNyW0_QECtUOqYpz96POrL7FCx-PEGOtYpMBqCmFZ_2IE62TWGkA0xfRnu-BGrlmquTsQTcKnfShvchde23dvaEgj6svea4rbruh6HlqOh7970CclovTfB1tD6vNfLaNyKKJiNFBkjmlCNCCA8MqrRxylgPr0hqbqwpL7fHVsPSYWGB6JcyN15ynAzH5u_14i-9QXyj8FG938XGnv65CO50</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Neural Head Avatars from Monocular RGB Videos</title><source>arXiv.org</source><creator>Grassal, Philip-William ; Prinzler, Malte ; Leistner, Titus ; Rother, Carsten ; Nießner, Matthias ; Thies, Justus</creator><creatorcontrib>Grassal, Philip-William ; Prinzler, Malte ; Leistner, Titus ; Rother, Carsten ; Nießner, Matthias ; Thies, Justus</creatorcontrib><description>We present Neural Head Avatars, a novel neural representation that explicitly models the surface geometry and appearance of an animatable human avatar that can be used for teleconferencing in AR/VR or other applications in the movie or games industry that rely on a digital human. Our representation can be learned from a monocular RGB portrait video that features a range of different expressions and views. Specifically, we propose a hybrid representation consisting of a morphable model for the coarse shape and expressions of the face, and two feed-forward networks, predicting vertex offsets of the underlying mesh as well as a view- and expression-dependent texture. We demonstrate that this representation is able to accurately extrapolate to unseen poses and view points, and generates natural expressions while providing sharp texture details. 
Compared to previous works on head avatars, our method provides a disentangled shape and appearance model of the complete human head (including hair) that is compatible with the standard graphics pipeline. Moreover, it quantitatively and qualitatively outperforms current state of the art in terms of reconstruction quality and novel-view synthesis.</description><identifier>DOI: 10.48550/arxiv.2112.01554</identifier><language>eng</language><subject>Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Graphics</subject><creationdate>2021-12</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2112.01554$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2112.01554$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Grassal, Philip-William</creatorcontrib><creatorcontrib>Prinzler, Malte</creatorcontrib><creatorcontrib>Leistner, Titus</creatorcontrib><creatorcontrib>Rother, Carsten</creatorcontrib><creatorcontrib>Nießner, Matthias</creatorcontrib><creatorcontrib>Thies, Justus</creatorcontrib><title>Neural Head Avatars from Monocular RGB Videos</title><description>We present Neural Head Avatars, a novel neural representation that explicitly models the surface geometry and appearance of an animatable human avatar that can be used for teleconferencing in AR/VR or other applications in the movie or games industry that rely on a digital human. Our representation can be learned from a monocular RGB portrait video that features a range of different expressions and views. 
Specifically, we propose a hybrid representation consisting of a morphable model for the coarse shape and expressions of the face, and two feed-forward networks, predicting vertex offsets of the underlying mesh as well as a view- and expression-dependent texture. We demonstrate that this representation is able to accurately extrapolate to unseen poses and view points, and generates natural expressions while providing sharp texture details. Compared to previous works on head avatars, our method provides a disentangled shape and appearance model of the complete human head (including hair) that is compatible with the standard graphics pipeline. Moreover, it quantitatively and qualitatively outperforms current state of the art in terms of reconstruction quality and novel-view synthesis.</description><subject>Computer Science - Computer Vision and Pattern Recognition</subject><subject>Computer Science - Graphics</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotzs2KwjAUhuFsXIh6Aa7MDbQmbU5Ou3TEP_AHRNyW0_QECtUOqYpz96POrL7FCx-PEGOtYpMBqCmFZ_2IE62TWGkA0xfRnu-BGrlmquTsQTcKnfShvchde23dvaEgj6svea4rbruh6HlqOh7970CclovTfB1tD6vNfLaNyKKJiNFBkjmlCNCCA8MqrRxylgPr0hqbqwpL7fHVsPSYWGB6JcyN15ynAzH5u_14i-9QXyj8FG938XGnv65CO50</recordid><startdate>20211202</startdate><enddate>20211202</enddate><creator>Grassal, Philip-William</creator><creator>Prinzler, Malte</creator><creator>Leistner, Titus</creator><creator>Rother, Carsten</creator><creator>Nießner, Matthias</creator><creator>Thies, Justus</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20211202</creationdate><title>Neural Head Avatars from Monocular RGB Videos</title><author>Grassal, Philip-William ; Prinzler, Malte ; Leistner, Titus ; Rother, Carsten ; Nießner, Matthias ; Thies, 
Justus</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a674-ae7c528c00a5765c54e03dc7e895e1b64690d7b1f75c57bf7265ea5e1794f1e93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Computer Science - Computer Vision and Pattern Recognition</topic><topic>Computer Science - Graphics</topic><toplevel>online_resources</toplevel><creatorcontrib>Grassal, Philip-William</creatorcontrib><creatorcontrib>Prinzler, Malte</creatorcontrib><creatorcontrib>Leistner, Titus</creatorcontrib><creatorcontrib>Rother, Carsten</creatorcontrib><creatorcontrib>Nießner, Matthias</creatorcontrib><creatorcontrib>Thies, Justus</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Grassal, Philip-William</au><au>Prinzler, Malte</au><au>Leistner, Titus</au><au>Rother, Carsten</au><au>Nießner, Matthias</au><au>Thies, Justus</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Neural Head Avatars from Monocular RGB Videos</atitle><date>2021-12-02</date><risdate>2021</risdate><abstract>We present Neural Head Avatars, a novel neural representation that explicitly models the surface geometry and appearance of an animatable human avatar that can be used for teleconferencing in AR/VR or other applications in the movie or games industry that rely on a digital human. Our representation can be learned from a monocular RGB portrait video that features a range of different expressions and views. Specifically, we propose a hybrid representation consisting of a morphable model for the coarse shape and expressions of the face, and two feed-forward networks, predicting vertex offsets of the underlying mesh as well as a view- and expression-dependent texture. 
We demonstrate that this representation is able to accurately extrapolate to unseen poses and view points, and generates natural expressions while providing sharp texture details. Compared to previous works on head avatars, our method provides a disentangled shape and appearance model of the complete human head (including hair) that is compatible with the standard graphics pipeline. Moreover, it quantitatively and qualitatively outperforms current state of the art in terms of reconstruction quality and novel-view synthesis.</abstract><doi>10.48550/arxiv.2112.01554</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2112.01554
ispartof
issn
language eng
recordid cdi_arxiv_primary_2112_01554
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
Computer Science - Graphics
title Neural Head Avatars from Monocular RGB Videos
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T18%3A43%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Neural%20Head%20Avatars%20from%20Monocular%20RGB%20Videos&rft.au=Grassal,%20Philip-William&rft.date=2021-12-02&rft_id=info:doi/10.48550/arxiv.2112.01554&rft_dat=%3Carxiv_GOX%3E2112_01554%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true