Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Translation
One of the important research topics in image generative models is to disentangle the spatial contents and styles for their separate control. Although StyleGAN can generate content feature vectors from random noises, the resulting spatial content control is primarily intended for minor spatial varia...
Saved in:
Main Authors: | Kwon, Gihyun; Ye, Jong Chul |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Computer Vision and Pattern Recognition |
Online Access: | Order full text |
creator | Kwon, Gihyun; Ye, Jong Chul |
description | One of the important research topics in image generative models is to
disentangle the spatial contents and styles for their separate control.
Although StyleGAN can generate content feature vectors from random noise, the
resulting spatial content control is primarily intended for minor spatial
variations, and the disentanglement of global content and styles is by no means
complete. Inspired by a mathematical understanding of normalization and
attention, here we present novel hierarchical adaptive Diagonal spatial
ATtention (DAT) layers to separately manipulate the spatial contents from
styles in a hierarchical manner. Using DAT and AdaIN, our method enables
coarse-to-fine level disentanglement of spatial contents and styles. In
addition, our generator can be easily integrated into the GAN inversion
framework so that the content and style of translated images from multi-domain
image translation tasks can be flexibly controlled. By using various datasets,
we confirm that the proposed method not only outperforms the existing models in
disentanglement scores, but also provides more flexible control over spatial
features in the generated images. |
doi_str_mv | 10.48550/arxiv.2103.16146 |
format | Article |
fullrecord | (raw XML source record; duplicates the title, creators, and abstract above) |
creationdate | 2021-03-30 |
rights | http://creativecommons.org/licenses/by/4.0 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2103.16146 |
language | eng |
recordid | cdi_arxiv_primary_2103_16146 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition |
title | Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Translation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T21%3A29%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Diagonal%20Attention%20and%20Style-based%20GAN%20for%20Content-Style%20Disentanglement%20in%20Image%20Generation%20and%20Translation&rft.au=Kwon,%20Gihyun&rft.date=2021-03-30&rft_id=info:doi/10.48550/arxiv.2103.16146&rft_dat=%3Carxiv_GOX%3E2103_16146%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
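As context for the AdaIN operation the abstract refers to, here is a minimal NumPy sketch of adaptive instance normalization (Huang & Belongie, 2017): each channel of a content feature map is normalized to zero mean and unit variance, then re-scaled by the per-channel statistics of a style feature map. This illustrates the generic AdaIN operation only, not the paper's DAT layers; the function name and shapes are illustrative.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization over (C, H, W) feature maps.

    Normalizes each content channel to zero mean / unit variance,
    then shifts and scales it with the style map's per-channel
    mean and standard deviation.
    """
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    return s_std * (content - c_mean) / (c_std + eps) + s_mean

# After AdaIN, each output channel carries the style's statistics
# while the content's spatial layout is preserved.
rng = np.random.default_rng(0)
content = rng.standard_normal((8, 4, 4))
style = 3.0 + 2.0 * rng.standard_normal((8, 4, 4))
out = adain(content, style)
```

In style-based generators this is how per-layer style codes modulate the feature statistics; the paper's contribution is the DAT layer that controls the spatial content path separately from this style path.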