MuSAnet: Monocular-to-3-D Human Modeling via Multi-Scale Spatial Awareness

Monocular-to-3D human modeling involves creating colored three-dimensional models of humans from monocular try-on images. This technology offers personalized services to consumers and has garnered considerable attention for its potential business value. However, current methods are unable to deform clothing images to align with the human body naturally. Additionally, the generation of low-quality monocular try-on images severely hinders the creation of high-precision human models. This paper presents a novel monocular-to-3D human modeling network capable of accurately generating 3D models from monocular try-on images. To improve the accuracy of clothing deformation, an enhanced non-rigid deformation constraint strategy is introduced. This strategy helps reduce excessive deformation by strengthening penalties for outliers. Additionally, occlusion is addressed by implementing strict boundary constraints, resulting in more realistic and natural deformation outcomes. Furthermore, a stepped spatial-aware block is proposed to fuse latent multi-scale shape features in person images during depth estimation. This approach allows for creating high-precision person models in a single stage, enhancing the overall quality of the generated 3D models. Experiments conducted on the MPV-3D dataset demonstrate the superiority of the method. Regarding human modeling, Abs. decreased from 7.88 to 7.38, Sq. from 0.39 to 0.34, and RMSE from 11.27 to 10.66.
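
The record carries no code, but the abstract's idea of "strengthening penalties for outliers" during non-rigid clothing deformation maps naturally onto a reverse-Huber-style penalty, which charges small warp offsets linearly and large ones quadratically. A minimal PyTorch-style sketch follows; the function name, tensor layout, and threshold are illustrative assumptions, not the paper's actual loss.

import torch

def outlier_weighted_warp_penalty(offsets, threshold=0.1):
    # Reverse-Huber (BerHu) style penalty on per-pixel warp offsets.
    # Offsets up to `threshold` are charged linearly; larger offsets
    # are charged quadratically, strengthening the penalty on outliers
    # and discouraging extreme, unnatural clothing deformation.
    # Assumes `offsets` is a (B, 2, H, W) displacement field; this
    # layout and the threshold value are hypothetical.
    mag = offsets.norm(dim=1)                      # (B, H, W) offset magnitudes
    quad = (mag ** 2 + threshold ** 2) / (2 * threshold)
    return torch.where(mag <= threshold, mag, quad).mean()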

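The "stepped spatial-aware block" that fuses latent multi-scale shape features during depth estimation can likewise be pictured as parallel dilated-convolution branches merged one step at a time. The sketch below is hypothetical: the branch count, dilation rates, and residual connection are assumptions, and the published block may be structured differently.

import torch
import torch.nn as nn

class SteppedMultiScaleBlock(nn.Module):
    # Hypothetical stand-in for a stepped multi-scale fusion block:
    # each branch sees a larger receptive field via dilation, and the
    # branches are fused progressively ("stepped") by 1x1 convolutions.
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        self.fuse = nn.ModuleList(
            nn.Conv2d(2 * channels, channels, 1)
            for _ in dilations[1:]
        )

    def forward(self, x):
        out = self.branches[0](x)
        # Merge each larger-scale branch into the running result.
        for branch, fuse in zip(self.branches[1:], self.fuse):
            out = fuse(torch.cat([out, branch(x)], dim=1))
        return out + x  # residual connection keeps the block drop-in

For example, SteppedMultiScaleBlock(64)(torch.randn(1, 64, 32, 32)) returns a tensor of the same (1, 64, 32, 32) shape, so such a block could be stacked inside a depth-estimation backbone.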

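The Abs., Sq., and RMSE figures quoted at the end of the abstract are, by the usual convention in monocular depth estimation, absolute relative error, squared relative error, and root-mean-square error. A minimal NumPy sketch under that assumption (pred and gt are positive depth maps of equal shape; any scale factor applied when reporting the numbers is omitted):

import numpy as np

def depth_metrics(pred, gt):
    # Standard monocular-depth error metrics, presumably those behind
    # the abstract's Abs./Sq./RMSE numbers (reporting scale not applied).
    abs_rel = np.mean(np.abs(pred - gt) / gt)    # absolute relative error
    sq_rel = np.mean((pred - gt) ** 2 / gt)      # squared relative error
    rmse = np.sqrt(np.mean((pred - gt) ** 2))    # root-mean-square error
    return abs_rel, sq_rel, rmse
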
Bibliographic Details
Published in: IEEE Transactions on Consumer Electronics, 2024-08, Vol. 70(3), pp. 5115-5127
Main Authors: Du, Chenghu; Xiong, Shengwu
Format: Article
Language: English
Subjects: Biological system modeling; Computational modeling; consumer technology; Deformable models; Deformation; generative adversarial network; Monocular-to-3D human modeling; Solid modeling; Three-dimensional displays; virtual try-on
Online Access: Order full text
DOI: 10.1109/TCE.2024.3410989
ISSN: 0098-3063
EISSN: 1558-4127
CODEN: ITCEDA
Source: IEEE Electronic Library (IEL)