A Multi-Domain and Multi-Modal Representation Disentangler for Cross-Domain Image Manipulation and Classification

Learning interpretable data representation has been an active research topic in deep learning and computer vision. While representation disentanglement is an effective technique for addressing this task, existing works cannot easily handle the problems in which manipulating and recognizing data across multiple domains are desirable. In this paper, we present a unified network architecture of Multi-domain and Multi-modal Representation Disentangler (M²RD), with the goal of learning domain-invariant content representation with the associated domain-specific representation observed. By advancing adversarial learning and disentanglement techniques, the proposed model is able to perform continuous image manipulation across data domains with multiple modalities. More importantly, the resulting domain-invariant feature representation can be applied for unsupervised domain adaptation. Finally, our quantitative and qualitative results confirm the effectiveness and robustness of the proposed model over state-of-the-art methods on the above tasks.
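To make the abstract's core idea concrete, here is a minimal NumPy sketch of disentangling an image into a domain-invariant content code and a domain-specific code, then recombining codes across domains for cross-domain manipulation. All dimensions, weights, and function names are hypothetical illustrations, not the authors' M²RD implementation (which uses trained adversarial networks rather than fixed linear maps).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not taken from the paper).
img_dim, content_dim, domain_dim, n_domains = 64, 8, 4, 3

# Linear "encoder": splits an image vector into a domain-invariant
# content code and a domain-specific code.
W_content = rng.standard_normal((content_dim, img_dim)) * 0.1
W_domain = rng.standard_normal((domain_dim, img_dim)) * 0.1

# Linear "generator": maps (content code, domain code, one-hot domain
# label) back to image space, so swapping domain codes/labels while
# keeping the content code performs a cross-domain translation.
W_gen = rng.standard_normal((img_dim, content_dim + domain_dim + n_domains)) * 0.1

def encode(x):
    """Return (content code, domain-specific code) for image vector x."""
    return W_content @ x, W_domain @ x

def generate(content, domain_code, domain_label):
    """Reconstruct an image vector from disentangled codes."""
    onehot = np.eye(n_domains)[domain_label]
    return W_gen @ np.concatenate([content, domain_code, onehot])

# Cross-domain manipulation: take the content of an image from domain 0
# and decode it with the domain-specific code of an image from domain 1.
x_a = rng.standard_normal(img_dim)  # image from domain 0
x_b = rng.standard_normal(img_dim)  # image from domain 1
c_a, _ = encode(x_a)
_, d_b = encode(x_b)
x_translated = generate(c_a, d_b, domain_label=1)
print(x_translated.shape)  # → (64,)
```

In the actual model, an adversarial domain classifier on the content code is what forces it to be domain-invariant; the fixed random matrices here only illustrate the code-swapping mechanics.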

Detailed Description

Saved in:
Bibliographic Details
Published in: IEEE transactions on image processing 2020-01, Vol.29, p.2795-2807
Main Authors: Yang, Fu-En, Chang, Jing-Cheng, Tsai, Chung-Chi, Wang, Yu-Chiang Frank
Format: Article
Language: English
Subjects:
Online Access: Order full text
container_end_page 2807
container_issue
container_start_page 2795
container_title IEEE transactions on image processing
container_volume 29
creator Yang, Fu-En
Chang, Jing-Cheng
Tsai, Chung-Chi
Wang, Yu-Chiang Frank
description Learning interpretable data representation has been an active research topic in deep learning and computer vision. While representation disentanglement is an effective technique for addressing this task, existing works cannot easily handle the problems in which manipulating and recognizing data across multiple domains are desirable. In this paper, we present a unified network architecture of Multi-domain and Multi-modal Representation Disentangler (M²RD), with the goal of learning domain-invariant content representation with the associated domain-specific representation observed. By advancing adversarial learning and disentanglement techniques, the proposed model is able to perform continuous image manipulation across data domains with multiple modalities. More importantly, the resulting domain-invariant feature representation can be applied for unsupervised domain adaptation. Finally, our quantitative and qualitative results confirm the effectiveness and robustness of the proposed model over state-of-the-art methods on the above tasks.
doi_str_mv 10.1109/TIP.2019.2952707
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 1057-7149
ispartof IEEE transactions on image processing, 2020-01, Vol.29, p.2795-2807
issn 1057-7149
1941-0042
language eng
recordid cdi_ieee_primary_8902223
source IEEE Electronic Library (IEL)
subjects Adaptation models
Computer vision
Data models
Deep learning
domain adaptation
Feature extraction
Gallium nitride
Image classification
Image manipulation
image translation
Invariants
Machine learning
Network architecture
Representation disentanglement
Representations
Task analysis
title A Multi-Domain and Multi-Modal Representation Disentangler for Cross-Domain Image Manipulation and Classification