MulayCap: Multi-layer Human Performance Capture Using A Monocular Video Camera

We introduce MulayCap, a novel human performance capture method using a monocular video camera without the need for pre-scanning. The method uses "multi-layer" representations for geometry reconstruction and texture rendering, respectively. For geometry reconstruction, we decompose the clothed human into multiple geometry layers, namely a body mesh layer and a garment piece layer. The key technique behind is a Garment-from-Video (GfV) method for optimizing the garment shape and reconstructing the dynamic cloth to fit the input video sequence, based on a cloth simulation model which is effectively solved with gradient descent. For texture rendering, we decompose each input image frame into a shading layer and an albedo layer, and propose a method for fusing a fixed albedo map and solving for detailed garment geometry using the shading layer. Compared with existing single view human performance capture systems, our "multi-layer" approach bypasses the tedious and time consuming scanning step for obtaining a human specific mesh template. Experimental results demonstrate that MulayCap produces realistic rendering of dynamically changing details that has not been achieved in any previous monocular video camera systems. Benefiting from its fully semantic modeling, MulayCap can be applied to various important editing applications, such as cloth editing, re-targeting, relighting, and AR applications.
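The Garment-from-Video (GfV) step described above optimizes garment shape by gradient descent on a cloth-simulation energy so that the reconstructed cloth fits the video. As a loose, self-contained illustration of that fitting idea only — the function name, the one-dimensional "silhouette width" parameters, and the smoothness term standing in for cloth dynamics are all hypothetical simplifications, not the paper's actual formulation, which fits full garment meshes:

```python
# A minimal sketch of gradient-descent fitting in the spirit of GfV.
# All names and the 1-D objective are hypothetical simplifications.

def fit_garment_widths(observed, lr=0.1, steps=500):
    """Fit per-frame garment widths w_t to observed silhouette widths
    by gradient descent on a squared-error data term plus a smoothness
    term coupling consecutive frames (a stand-in for cloth dynamics)."""
    w = [0.0] * len(observed)   # initial garment parameters, one per frame
    lam = 0.5                   # smoothness (dynamics) weight
    for _ in range(steps):
        grad = []
        for t in range(len(w)):
            g = 2.0 * (w[t] - observed[t])            # data term gradient
            if t > 0:                                  # coupling to previous frame
                g += 2.0 * lam * (w[t] - w[t - 1])
            if t < len(w) - 1:                         # coupling to next frame
                g += 2.0 * lam * (w[t] - w[t + 1])
            grad.append(g)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]  # gradient step
    return w
```

The converged result tracks the observations while being temporally smoothed, which is the qualitative behavior a dynamics-regularized fit produces.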

Detailed Description

Bibliographic Details
Published in: arXiv.org, 2020-10
Main authors: Su, Zhaoqi; Wan, Weilin; Yu, Tao; Liu, Lingjie; Lu, Fang; Wang, Wenping; Liu, Yebin
Format: Article
Language: English
Online access: Full text
description We introduce MulayCap, a novel human performance capture method using a monocular video camera without the need for pre-scanning. The method uses "multi-layer" representations for geometry reconstruction and texture rendering, respectively. For geometry reconstruction, we decompose the clothed human into multiple geometry layers, namely a body mesh layer and a garment piece layer. The key technique behind is a Garment-from-Video (GfV) method for optimizing the garment shape and reconstructing the dynamic cloth to fit the input video sequence, based on a cloth simulation model which is effectively solved with gradient descent. For texture rendering, we decompose each input image frame into a shading layer and an albedo layer, and propose a method for fusing a fixed albedo map and solving for detailed garment geometry using the shading layer. Compared with existing single view human performance capture systems, our "multi-layer" approach bypasses the tedious and time consuming scanning step for obtaining a human specific mesh template. Experimental results demonstrate that MulayCap produces realistic rendering of dynamically changing details that has not been achieved in any previous monocular video camera systems. Benefiting from its fully semantic modeling, MulayCap can be applied to various important editing applications, such as cloth editing, re-targeting, relighting, and AR applications.
DOI: 10.48550/arxiv.2004.05815
Publisher: Cornell University Library, Ithaca (arXiv.org)
Published version: DOI 10.1109/TVCG.2020.3027763 (access to full text may be restricted)
Rights: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
EISSN: 2331-8422
Source: arXiv.org; Free E-Journals
Subjects: Albedo
Camcorders
Cameras
Cloth
Computer Science - Computer Vision and Pattern Recognition
Computer simulation
Decomposition
Editing
Finite element method
Geometry
Human performance
Multilayers
Reconstruction
Rendering
Shading
Texture