MulayCap: Multi-layer Human Performance Capture Using A Monocular Video Camera
We introduce MulayCap, a novel human performance capture method using a monocular video camera without the need for pre-scanning. The method uses "multi-layer" representations for geometry reconstruction and texture rendering, respectively. For geometry reconstruction, we decompose the clothed human into multiple geometry layers, namely a body mesh layer and a garment piece layer. The key technique behind this is a Garment-from-Video (GfV) method for optimizing the garment shape and reconstructing the dynamic cloth to fit the input video sequence, based on a cloth simulation model which is effectively solved with gradient descent. For texture rendering, we decompose each input image frame into a shading layer and an albedo layer, and propose a method for fusing a fixed albedo map and solving for detailed garment geometry using the shading layer. Compared with existing single-view human performance capture systems, our "multi-layer" approach bypasses the tedious and time-consuming scanning step for obtaining a human-specific mesh template. Experimental results demonstrate that MulayCap produces realistic rendering of dynamically changing details that have not been achieved in any previous monocular video camera system. Benefiting from its fully semantic modeling, MulayCap can be applied to various important editing applications, such as cloth editing, re-targeting, relighting, and AR applications.
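The Garment-from-Video step described above optimizes garment shape parameters so that simulated cloth fits the input video, solved with gradient descent. The following Python sketch is purely illustrative and not the authors' implementation: it fits a small parameter vector by finite-difference gradient descent against per-frame silhouettes. `simulate_cloth`, `render_silhouette`, and the `frames` structure are hypothetical placeholders for a cloth simulator, a silhouette renderer, and the preprocessed video.

```python
# Illustrative GfV-style fitting sketch (hypothetical names throughout,
# not the paper's code): gradient descent over garment shape parameters.
import numpy as np

def silhouette_loss(params, frames, simulate_cloth, render_silhouette):
    """Sum over frames of squared differences between the rendered
    silhouette of the simulated garment and the observed silhouette."""
    loss = 0.0
    for frame in frames:
        mesh = simulate_cloth(params, frame["body_pose"])   # one physics rollout
        pred = render_silhouette(mesh, frame["camera"])     # predicted mask
        loss += float(np.sum((pred - frame["silhouette"]) ** 2))
    return loss

def fit_garment(params0, frames, simulate_cloth, render_silhouette,
                lr=1e-2, steps=200, eps=1e-4):
    """Plain finite-difference gradient descent over the garment
    shape parameters (e.g., 2D pattern dimensions)."""
    params = np.asarray(params0, dtype=float).copy()
    for _ in range(steps):
        base = silhouette_loss(params, frames, simulate_cloth, render_silhouette)
        grad = np.zeros_like(params)
        for i in range(params.size):                        # numerical gradient
            trial = params.copy()
            trial[i] += eps
            grad[i] = (silhouette_loss(trial, frames, simulate_cloth,
                                       render_silhouette) - base) / eps
        params -= lr * grad                                 # descent step
    return params
```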
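On the texture side, each frame is decomposed into a shading layer and an albedo layer, and a single fixed albedo map is fused across the sequence. Below is a minimal sketch under a Lambertian assumption, image = albedo × shading; the function names are hypothetical, and the paper's further step of recovering detailed geometry from the shading layer (shape-from-shading) is not shown.

```python
# Minimal intrinsic-decomposition sketch (assumed Lambertian model,
# hypothetical function names; not the authors' method).
import numpy as np

def shading_layer(frame_rgb, albedo, eps=1e-6):
    """Shading = image / albedo under image = albedo * shading
    (all arrays HxWx3 with values in [0, 1])."""
    return frame_rgb / np.clip(albedo, eps, None)

def fuse_fixed_albedo(frames_rgb, shadings, eps=1e-6):
    """Fuse one fixed albedo map by averaging the per-frame
    estimates albedo = image / shading over the whole sequence."""
    per_frame = [f / np.clip(s, eps, None)
                 for f, s in zip(frames_rgb, shadings)]
    return np.clip(np.mean(per_frame, axis=0), 0.0, 1.0)
```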
Published in: | arXiv.org, 2020-10 |
---|---|
Main authors: | Su, Zhaoqi; Wan, Weilin; Yu, Tao; Liu, Lingjie; Lu, Fang; Wang, Wenping; Liu, Yebin |
Format: | Article |
Language: | English |
Subjects: | Albedo; Camcorders; Cameras; Cloth; Computer Science - Computer Vision and Pattern Recognition; Computer simulation; Decomposition; Editing; Finite element method; Geometry; Human performance; Multilayers; Reconstruction; Rendering; Shading; Texture |
Online access: | Full text |
container_title | arXiv.org |
creator | Su, Zhaoqi; Wan, Weilin; Yu, Tao; Liu, Lingjie; Lu, Fang; Wang, Wenping; Liu, Yebin |
description | We introduce MulayCap, a novel human performance capture method using a monocular video camera without the need for pre-scanning. The method uses "multi-layer" representations for geometry reconstruction and texture rendering, respectively. For geometry reconstruction, we decompose the clothed human into multiple geometry layers, namely a body mesh layer and a garment piece layer. The key technique behind this is a Garment-from-Video (GfV) method for optimizing the garment shape and reconstructing the dynamic cloth to fit the input video sequence, based on a cloth simulation model which is effectively solved with gradient descent. For texture rendering, we decompose each input image frame into a shading layer and an albedo layer, and propose a method for fusing a fixed albedo map and solving for detailed garment geometry using the shading layer. Compared with existing single-view human performance capture systems, our "multi-layer" approach bypasses the tedious and time-consuming scanning step for obtaining a human-specific mesh template. Experimental results demonstrate that MulayCap produces realistic rendering of dynamically changing details that have not been achieved in any previous monocular video camera system. Benefiting from its fully semantic modeling, MulayCap can be applied to various important editing applications, such as cloth editing, re-targeting, relighting, and AR applications. |
doi | 10.48550/arxiv.2004.05815 |
format | Article |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2020-10 |
issn | 2331-8422 |
language | eng |
source | arXiv.org; Free E-Journals |
subjects | Albedo; Camcorders; Cameras; Cloth; Computer Science - Computer Vision and Pattern Recognition; Computer simulation; Decomposition; Editing; Finite element method; Geometry; Human performance; Multilayers; Reconstruction; Rendering; Shading; Texture |
title | MulayCap: Multi-layer Human Performance Capture Using A Monocular Video Camera |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T20%3A50%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MulayCap:%20Multi-layer%20Human%20Performance%20Capture%20Using%20A%20Monocular%20Video%20Camera&rft.jtitle=arXiv.org&rft.au=Su,%20Zhaoqi&rft.date=2020-10-01&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2004.05815&rft_dat=%3Cproquest_arxiv%3E2389384845%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2389384845&rft_id=info:pmid/&rfr_iscdi=true |