MuSAnet: Monocular-to-3-D Human Modeling via Multi-Scale Spatial Awareness

Monocular-to-3D human modeling involves creating colored three-dimensional models of humans from monocular try-on images. This technology offers personalized services to consumers and has garnered considerable attention for its potential business value. However, current methods are unable to deform clothing images to align with the human body naturally. Additionally, the generation of low-quality monocular try-on images severely hinders the creation of high-precision human models. This paper presents a novel monocular-to-3D human modeling network capable of accurately generating 3D models from monocular try-on images. To improve the accuracy of clothing deformation, an enhanced non-rigid deformation constraint strategy is introduced. This strategy helps reduce excessive deformation by strengthening penalties for outliers. Additionally, occlusion is addressed by implementing strict boundary constraints, resulting in more realistic and natural deformation outcomes. Furthermore, a stepped spatial-aware block is proposed to fuse latent multi-scale shape features in person images during depth estimation. This approach allows for creating high-precision person models in a single stage, enhancing the overall quality of the generated 3D models. Experiments conducted on the MPV-3D dataset demonstrate the superiority of the method. Regarding human modeling, Abs. decreased from 7.88 to 7.38, Sq. from 0.39 to 0.34, and RMSE from 11.27 to 10.66.
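
The record carries no code, but the abstract's idea of "strengthening penalties for outliers" during non-rigid clothing deformation maps naturally onto a reverse-Huber-style penalty, which charges small warp offsets linearly and large ones quadratically. A minimal PyTorch-style sketch follows; the function name, tensor layout, and threshold are illustrative assumptions, not the paper's actual loss.

import torch

def outlier_weighted_warp_penalty(offsets, threshold=0.1):
    # Reverse-Huber (BerHu) style penalty on per-pixel warp offsets.
    # Offsets up to `threshold` are charged linearly; larger offsets
    # are charged quadratically, strengthening the penalty on outliers
    # and discouraging extreme, unnatural clothing deformation.
    # Assumes `offsets` is a (B, 2, H, W) displacement field; this
    # layout and the threshold value are hypothetical.
    mag = offsets.norm(dim=1)                      # (B, H, W) offset magnitudes
    quad = (mag ** 2 + threshold ** 2) / (2 * threshold)
    return torch.where(mag <= threshold, mag, quad).mean()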

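The "stepped spatial-aware block" that fuses latent multi-scale shape features during depth estimation can likewise be pictured as parallel dilated-convolution branches merged one step at a time. The sketch below is hypothetical: the branch count, dilation rates, and residual connection are assumptions, and the published block may be structured differently.

import torch
import torch.nn as nn

class SteppedMultiScaleBlock(nn.Module):
    # Hypothetical stand-in for a stepped multi-scale fusion block:
    # each branch sees a larger receptive field via dilation, and the
    # branches are fused progressively ("stepped") by 1x1 convolutions.
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        self.fuse = nn.ModuleList(
            nn.Conv2d(2 * channels, channels, 1)
            for _ in dilations[1:]
        )

    def forward(self, x):
        out = self.branches[0](x)
        # Merge each larger-scale branch into the running result.
        for branch, fuse in zip(self.branches[1:], self.fuse):
            out = fuse(torch.cat([out, branch(x)], dim=1))
        return out + x  # residual connection keeps the block drop-in

For example, SteppedMultiScaleBlock(64)(torch.randn(1, 64, 32, 32)) returns a tensor of the same (1, 64, 32, 32) shape, so such a block could be stacked inside a depth-estimation backbone.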

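The Abs., Sq., and RMSE figures quoted at the end of the abstract are, by the usual convention in monocular depth estimation, absolute relative error, squared relative error, and root-mean-square error. A minimal NumPy sketch under that assumption (pred and gt are positive depth maps of equal shape; any scale factor applied when reporting the numbers is omitted):

import numpy as np

def depth_metrics(pred, gt):
    # Standard monocular-depth error metrics, presumably those behind
    # the abstract's Abs./Sq./RMSE numbers (reporting scale not applied).
    abs_rel = np.mean(np.abs(pred - gt) / gt)    # absolute relative error
    sq_rel = np.mean((pred - gt) ** 2 / gt)      # squared relative error
    rmse = np.sqrt(np.mean((pred - gt) ** 2))    # root-mean-square error
    return abs_rel, sq_rel, rmse
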
Bibliographic Details
Published in: IEEE Transactions on Consumer Electronics, 2024-08, Vol. 70(3), pp. 5115-5127
Main Authors: Du, Chenghu; Xiong, Shengwu
Format: Article
Language: English
Subjects: Biological system modeling; Computational modeling; consumer technology; Deformable models; Deformation; generative adversarial network; Monocular-to-3D human modeling; Solid modeling; Three-dimensional displays; virtual try-on
Online Access: Order full text
DOI: 10.1109/TCE.2024.3410989
ISSN: 0098-3063
EISSN: 1558-4127
CODEN: ITCEDA
Source: IEEE Electronic Library (IEL)