PFDM: Parser-Free Virtual Try-on via Diffusion Model

Virtual try-on can significantly improve the garment shopping experiences in both online and in-store scenarios, attracting broad interest in computer vision. However, to achieve high-fidelity try-on performance, most state-of-the-art methods still rely on accurate segmentation masks, which are often produced by near-perfect parsers or manual labeling. To overcome the bottleneck, we propose a parser-free virtual try-on method based on the diffusion model (PFDM). Given two images, PFDM can "wear" garments on the target person seamlessly by implicitly warping without any other information. To learn the model effectively, we synthesize many pseudo-images and construct sample pairs by wearing various garments on persons. Supervised by the large-scale expanded dataset, we fuse the person and garment features using a proposed Garment Fusion Attention (GFA) mechanism. Experiments demonstrate that our proposed PFDM can successfully handle complex cases, synthesize high-fidelity images, and outperform both state-of-the-art parser-free and parser-based models.
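The record gives no implementation details for the Garment Fusion Attention (GFA) mechanism mentioned in the abstract. As a purely illustrative aid, the minimal PyTorch sketch below shows one generic way person features could attend to garment features via cross-attention inside a denoising network; the class name, tensor shapes, and residual layout are assumptions for illustration, not the paper's actual architecture.

import torch
import torch.nn as nn

class GarmentFusionAttention(nn.Module):
    """Hypothetical sketch: fuse person (query) and garment (key/value)
    features with cross-attention. The real GFA design is not described
    in this record."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, person_tokens: torch.Tensor, garment_tokens: torch.Tensor) -> torch.Tensor:
        # Person tokens attend to garment tokens, so garment appearance is
        # injected without any parser-produced segmentation mask.
        fused, _ = self.attn(self.norm(person_tokens), garment_tokens, garment_tokens)
        return person_tokens + fused  # residual connection

# Toy usage: 256 spatial tokens with 320 channels each (shapes are assumed).
person = torch.randn(1, 256, 320)
garment = torch.randn(1, 256, 320)
out = GarmentFusionAttention(320)(person, garment)
print(out.shape)  # torch.Size([1, 256, 320])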

Bibliographic Details
Published in: arXiv.org, 2024-02
Main authors: Niu, Yunfang; Dong, Yi; Wu, Lingxiang; Liu, Zhiwei; Cai, Pengxiang; Wang, Jinqiao
Format: Article
Language: English
Subjects: Accuracy; Computer vision; Garments; Parsers; Synthesis
EISSN: 2331-8422
Publisher: Cornell University Library, arXiv.org (Ithaca)
Online access: Full text