Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Translation
One of the important research topics in image generative models is to disentangle the spatial contents and styles for their separate control. Although StyleGAN can generate content feature vectors from random noises, the resulting spatial content control is primarily intended for minor spatial varia...
Saved in:
Main Authors: | Kwon, Gihyun; Ye, Jong Chul |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Computer Vision and Pattern Recognition |
Online Access: | Order full text |
creator | Kwon, Gihyun; Ye, Jong Chul |
description | One of the important research topics in image generative models is to
disentangle the spatial contents and styles for their separate control.
Although StyleGAN can generate content feature vectors from random noise, the
resulting spatial content control is primarily intended for minor spatial
variations, and the disentanglement of global content and styles is by no means
complete. Inspired by a mathematical understanding of normalization and
attention, here we present novel hierarchical adaptive Diagonal spatial
ATtention (DAT) layers to separately manipulate the spatial contents from
styles in a hierarchical manner. Using DAT and AdaIN, our method enables
coarse-to-fine level disentanglement of spatial contents and styles. In
addition, our generator can be easily integrated into the GAN inversion
framework so that the content and style of translated images from multi-domain
image translation tasks can be flexibly controlled. By using various datasets,
we confirm that the proposed method not only outperforms the existing models in
disentanglement scores, but also provides more flexible control over spatial
features in the generated images. |
doi_str_mv | 10.48550/arxiv.2103.16146 |
format | Article |
fullrecord | (raw XML source record; duplicates the title, creators, and abstract above) |
creationdate | 2021-03-30 |
rights | http://creativecommons.org/licenses/by/4.0 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2103.16146 |
language | eng |
recordid | cdi_arxiv_primary_2103_16146 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition |
title | Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Translation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T21%3A29%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Diagonal%20Attention%20and%20Style-based%20GAN%20for%20Content-Style%20Disentanglement%20in%20Image%20Generation%20and%20Translation&rft.au=Kwon,%20Gihyun&rft.date=2021-03-30&rft_id=info:doi/10.48550/arxiv.2103.16146&rft_dat=%3Carxiv_GOX%3E2103_16146%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
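As context for the AdaIN operation the abstract refers to, here is a minimal NumPy sketch of adaptive instance normalization (Huang & Belongie, 2017): each channel of a content feature map is normalized to zero mean and unit variance, then re-scaled by the per-channel statistics of a style feature map. This illustrates the generic AdaIN operation only, not the paper's DAT layers; the function name and shapes are illustrative.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization over (C, H, W) feature maps.

    Normalizes each content channel to zero mean / unit variance,
    then shifts and scales it with the style map's per-channel
    mean and standard deviation.
    """
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    return s_std * (content - c_mean) / (c_std + eps) + s_mean

# After AdaIN, each output channel carries the style's statistics
# while the content's spatial layout is preserved.
rng = np.random.default_rng(0)
content = rng.standard_normal((8, 4, 4))
style = 3.0 + 2.0 * rng.standard_normal((8, 4, 4))
out = adain(content, style)
```

In style-based generators this is how per-layer style codes modulate the feature statistics; the paper's contribution is the DAT layer that controls the spatial content path separately from this style path.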