Human Gaussian Splatting: Real-time Rendering of Animatable Avatars


Detailed Description

Bibliographic Details
Main authors: Moreau, Arthur; Song, Jifei; Dhamo, Helisa; Shaw, Richard; Zhou, Yiren; Pérez-Pellitero, Eduardo
Format: Article
Language: eng
Subjects:
Online access: Order full text
Creators: Moreau, Arthur; Song, Jifei; Dhamo, Helisa; Shaw, Richard; Zhou, Yiren; Pérez-Pellitero, Eduardo
Description: This work addresses the problem of real-time rendering of photorealistic human body avatars learned from multi-view videos. While classical approaches to modeling and rendering virtual humans generally use a textured mesh, recent research has developed neural body representations that achieve impressive visual quality. However, these models are difficult to render in real-time, and their quality degrades when the character is animated with body poses different from the training observations. We propose an animatable human model based on 3D Gaussian Splatting, which has recently emerged as a very efficient alternative to neural radiance fields. The body is represented by a set of Gaussian primitives in a canonical space, which is deformed with a coarse-to-fine approach that combines forward skinning and local non-rigid refinement. We describe how to learn our Human Gaussian Splatting (HuGS) model in an end-to-end fashion from multi-view observations, and evaluate it against the state-of-the-art approaches for novel pose synthesis of clothed bodies. Our method achieves a 1.5 dB PSNR improvement over the state-of-the-art on the THuman4 dataset while being able to render in real-time (80 fps at 512x512 resolution).
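The coarse deformation step described above, forward skinning of Gaussian primitives from a canonical space, is commonly realized as linear blend skinning. The sketch below is illustrative only: the function name, array shapes, and joint transforms are assumptions for demonstration, not the paper's actual implementation (which also includes a non-rigid refinement stage not shown here).

```python
import numpy as np

def forward_skinning(points, weights, rotations, translations):
    """Deform canonical points with linear blend skinning (LBS).

    points:       (N, 3) canonical positions (e.g. Gaussian means)
    weights:      (N, J) per-point skinning weights; each row sums to 1
    rotations:    (J, 3, 3) per-joint rotation matrices
    translations: (J, 3)    per-joint translations
    Returns (N, 3) deformed positions: sum_j w[n,j] * (R_j @ x_n + t_j).
    """
    # Apply every joint's rigid transform to every point: (J, N, 3)
    transformed = np.einsum('jab,nb->jna', rotations, points) + translations[:, None, :]
    # Blend the per-joint results with the skinning weights: (N, 3)
    return np.einsum('nj,jna->na', weights, transformed)
```

With identity rotations and zero translations the canonical points are returned unchanged; moving a single joint drags each point in proportion to its weight for that joint.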
DOI: 10.48550/arxiv.2311.17113
Full record (summary of recoverable fields):
Publication date: 2023-11-28
Rights: http://creativecommons.org/licenses/by-nc-sa/4.0 (open access, free to read)
Record links: https://arxiv.org/abs/2311.17113 ; https://doi.org/10.48550/arXiv.2311.17113
Source: arXiv.org
Subjects: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Graphics