PFDM: Parser-Free Virtual Try-on via Diffusion Model

Virtual try-on can significantly improve the garment shopping experiences in both online and in-store scenarios, attracting broad interest in computer vision. However, to achieve high-fidelity try-on performance, most state-of-the-art methods still rely on accurate segmentation masks, which are often produced by near-perfect parsers or manual labeling. To overcome the bottleneck, we propose a parser-free virtual try-on method based on the diffusion model (PFDM). Given two images, PFDM can "wear" garments on the target person seamlessly by implicitly warping without any other information. To learn the model effectively, we synthesize many pseudo-images and construct sample pairs by wearing various garments on persons. Supervised by the large-scale expanded dataset, we fuse the person and garment features using a proposed Garment Fusion Attention (GFA) mechanism. Experiments demonstrate that our proposed PFDM can successfully handle complex cases, synthesize high-fidelity images, and outperform both state-of-the-art parser-free and parser-based models.
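The record gives no implementation details for the Garment Fusion Attention (GFA) mechanism mentioned in the abstract. As a purely illustrative aid, the minimal PyTorch sketch below shows one generic way person features could attend to garment features via cross-attention inside a denoising network; the class name, tensor shapes, and residual layout are assumptions for illustration, not the paper's actual architecture.

import torch
import torch.nn as nn

class GarmentFusionAttention(nn.Module):
    """Hypothetical sketch: fuse person (query) and garment (key/value)
    features with cross-attention. The real GFA design is not described
    in this record."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, person_tokens: torch.Tensor, garment_tokens: torch.Tensor) -> torch.Tensor:
        # Person tokens attend to garment tokens, so garment appearance is
        # injected without any parser-produced segmentation mask.
        fused, _ = self.attn(self.norm(person_tokens), garment_tokens, garment_tokens)
        return person_tokens + fused  # residual connection

# Toy usage: 256 spatial tokens with 320 channels each (shapes are assumed).
person = torch.randn(1, 256, 320)
garment = torch.randn(1, 256, 320)
out = GarmentFusionAttention(320)(person, garment)
print(out.shape)  # torch.Size([1, 256, 320])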

Bibliographic Details
Published in: arXiv.org, 2024-02
Main authors: Niu, Yunfang; Dong, Yi; Wu, Lingxiang; Liu, Zhiwei; Cai, Pengxiang; Wang, Jinqiao
Format: Article
Language: English
Subjects: Accuracy; Computer vision; Garments; Parsers; Synthesis
EISSN: 2331-8422
Publisher: Cornell University Library, arXiv.org (Ithaca)
Online access: Full text