FlowFace++: Explicit Semantic Flow-supervised End-to-End Face Swapping
This work proposes a novel face-swapping framework FlowFace++, utilizing explicit semantic flow supervision and end-to-end architecture to facilitate shape-aware face-swapping. Specifically, our work pretrains a facial shape discriminator to supervise the face swapping network. The discriminator is...
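Main authors: | Zhang, Yu; Zeng, Hao; Ma, Bowen; Zhang, Wei; Zhang, Zhimeng; Ding, Yu; Lv, Tangjie; Fan, Changjie |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Computer Vision and Pattern Recognition |
Online access: | Order full text |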
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Zhang, Yu ; Zeng, Hao ; Ma, Bowen ; Zhang, Wei ; Zhang, Zhimeng ; Ding, Yu ; Lv, Tangjie ; Fan, Changjie |
description | This work proposes a novel face-swapping framework, FlowFace++, which uses
explicit semantic flow supervision and an end-to-end architecture to facilitate
shape-aware face swapping. Specifically, our work pre-trains a facial shape
discriminator to supervise the face-swapping network. The discriminator is
shape-aware and relies on a semantic flow-guided operation to explicitly
calculate the shape discrepancies between the target and source faces, thus
optimizing the face-swapping network to generate highly realistic results. The
face-swapping network is a stack of a pre-trained face-masked autoencoder
(MAE), a cross-attention fusion module, and a convolutional decoder. The MAE
provides a fine-grained facial image representation space that is shared by
the target and source faces, which helps produce realistic final results. The
cross-attention fusion module carries out the source-to-target face swapping in
a fine-grained latent space while preserving other attributes of the target
image (e.g., expression, head pose, hair, background, and illumination).
Lastly, the convolutional decoder synthesizes the swapping results
from the face-swapping latent embedding produced by the cross-attention fusion
module. Extensive quantitative and qualitative experiments on in-the-wild faces
demonstrate that FlowFace++ significantly outperforms the state of the art,
particularly when the source face is affected by uneven lighting or an angle
offset. |
doi_str_mv | 10.48550/arxiv.2306.12686 |
format | Article |
creationdate | 2023-06-22 |
rights | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
oa | free_for_read |
linktorsrc | https://arxiv.org/abs/2306.12686 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2306.12686 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2306_12686 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition |
title | FlowFace++: Explicit Semantic Flow-supervised End-to-End Face Swapping |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T13%3A09%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=FlowFace++:%20Explicit%20Semantic%20Flow-supervised%20End-to-End%20Face%20Swapping&rft.au=Zhang,%20Yu&rft.date=2023-06-22&rft_id=info:doi/10.48550/arxiv.2306.12686&rft_dat=%3Carxiv_GOX%3E2306_12686%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
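
The description above says the cross-attention fusion module performs source-to-target swapping in a latent space while preserving the target's other attributes. The following is a minimal sketch of generic single-head cross-attention that illustrates the idea; it is not the authors' implementation. The function name `cross_attention_fuse`, the token counts, the embedding size, the random projection matrices, and the residual connection are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_fuse(target_tokens, source_tokens, w_q, w_k, w_v):
    """Hypothetical fusion step: target tokens query source tokens.

    target_tokens: (n_tgt, dim) latent tokens of the target face
    source_tokens: (n_src, dim) latent tokens of the source face
    w_q, w_k, w_v: (dim, dim) projection matrices (assumed shapes)
    """
    q = target_tokens @ w_q            # queries come from the target
    k = source_tokens @ w_k            # keys come from the source
    v = source_tokens @ w_v            # values come from the source
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each target token attends over source tokens
    # Residual connection (an assumption): keeps target information in the output.
    fused = target_tokens + weights @ v
    return fused, weights


# Illustrative sizes only; the record does not state the real dimensions.
rng = np.random.default_rng(0)
n_tgt, n_src, dim = 196, 196, 64
target_tokens = rng.standard_normal((n_tgt, dim))
source_tokens = rng.standard_normal((n_src, dim))
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3))

fused, weights = cross_attention_fuse(target_tokens, source_tokens, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In this sketch the output keeps one token per target position, which is one simple way to retain the target's spatial layout; how FlowFace++ actually separates identity from other attributes is not specified in this record.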