A Multi-Domain and Multi-Modal Representation Disentangler for Cross-Domain Image Manipulation and Classification
Learning interpretable data representation has been an active research topic in deep learning and computer vision. While representation disentanglement is an effective technique for addressing this task, existing works cannot easily handle the problems in which manipulating and recognizing data across multiple domains are desirable. In this paper, we present a unified network architecture of Multi-domain and Multi-modal Representation Disentangler (M²RD), with the goal of learning domain-invariant content representation with the associated domain-specific representation observed. By advancing adversarial learning and disentanglement techniques, the proposed model is able to perform continuous image manipulation across data domains with multiple modalities. More importantly, the resulting domain-invariant feature representation can be applied for unsupervised domain adaptation. Finally, our quantitative and qualitative results confirm the effectiveness and robustness of the proposed model over state-of-the-art methods on the above tasks.
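The abstract sketches a generic recipe: encode each image into a domain-invariant content code and a domain-specific code, train adversarially so the content code betrays nothing about its source domain, and recombine codes across domains to manipulate images. Below is a minimal PyTorch sketch of that general pattern, offered purely as an illustration and not as the authors' M²RD network; every module name, dimension, and the DANN-style gradient-reversal trick are assumptions made here for concreteness.

```python
# Illustrative sketch of adversarial representation disentanglement.
# NOT the authors' M2RD implementation; all shapes/modules are assumed.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses (and scales) gradients on
    the backward pass, so the encoder learns to fool the domain classifier."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lamb * grad_out, None

class Disentangler(nn.Module):
    def __init__(self, in_dim=784, content_dim=64, style_dim=16, n_domains=3):
        super().__init__()
        # Domain-invariant "content" branch.
        self.content_enc = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, content_dim))
        # Domain-specific "style" branch.
        self.style_enc = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, style_dim))
        # Decoder rebuilds an image from content + style.
        self.decoder = nn.Sequential(
            nn.Linear(content_dim + style_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim))
        # Adversarial domain classifier applied to the content code.
        self.domain_clf = nn.Linear(content_dim, n_domains)

    def forward(self, x, lamb=1.0):
        c, s = self.content_enc(x), self.style_enc(x)
        recon = self.decoder(torch.cat([c, s], dim=1))
        dom_logits = self.domain_clf(GradReverse.apply(c, lamb))
        return recon, dom_logits, c, s

# Cross-domain manipulation: keep the content of image A, borrow the
# style of image B, and decode.
model = Disentangler()
x_a, x_b = torch.randn(4, 784), torch.randn(4, 784)
c_a, s_b = model.content_enc(x_a), model.style_enc(x_b)
x_a_as_b = model.decoder(torch.cat([c_a, s_b], dim=1))
```

Training would combine a reconstruction loss on `recon` with a cross-entropy domain loss on `dom_logits`; the reversed gradient makes the content encoder and the domain classifier adversaries, which is one standard route to the domain-invariant representation that unsupervised domain adaptation then reuses.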
Saved in:
Published in: | IEEE transactions on image processing, 2020-01, Vol. 29, p. 2795-2807 |
---|---|
Main authors: | Yang, Fu-En; Chang, Jing-Cheng; Tsai, Chung-Chi; Wang, Yu-Chiang Frank |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Order full text |
container_end_page | 2807 |
---|---|
container_issue | |
container_start_page | 2795 |
container_title | IEEE transactions on image processing |
container_volume | 29 |
creator | Yang, Fu-En; Chang, Jing-Cheng; Tsai, Chung-Chi; Wang, Yu-Chiang Frank |
description | Learning interpretable data representation has been an active research topic in deep learning and computer vision. While representation disentanglement is an effective technique for addressing this task, existing works cannot easily handle the problems in which manipulating and recognizing data across multiple domains are desirable. In this paper, we present a unified network architecture of Multi-domain and Multi-modal Representation Disentangler (M²RD), with the goal of learning domain-invariant content representation with the associated domain-specific representation observed. By advancing adversarial learning and disentanglement techniques, the proposed model is able to perform continuous image manipulation across data domains with multiple modalities. More importantly, the resulting domain-invariant feature representation can be applied for unsupervised domain adaptation. Finally, our quantitative and qualitative results confirm the effectiveness and robustness of the proposed model over state-of-the-art methods on the above tasks. |
doi_str_mv | 10.1109/TIP.2019.2952707 (see the resolution sketch after this record) |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1057-7149 |
ispartof | IEEE transactions on image processing, 2020-01, Vol.29, p.2795-2807 |
issn | 1057-7149 1941-0042 |
language | eng |
recordid | cdi_ieee_primary_8902223 |
source | IEEE Electronic Library (IEL) |
subjects | Adaptation models; Computer vision; Data models; Deep learning; domain adaptation; Feature extraction; Gallium nitride; Image classification; Image manipulation; image translation; Invariants; Machine learning; Network architecture; Representation disentanglement; Representations; Task analysis |
title | A Multi-Domain and Multi-Modal Representation Disentangler for Cross-Domain Image Manipulation and Classification |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T21%3A52%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Multi-Domain%20and%20Multi-Modal%20Representation%20Disentangler%20for%20Cross-Domain%20Image%20Manipulation%20and%20Classification&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Yang,%20Fu-En&rft.date=2020-01-01&rft.volume=29&rft.spage=2795&rft.epage=2807&rft.pages=2795-2807&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2019.2952707&rft_dat=%3Cproquest_RIE%3E2317583852%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2345510705&rft_id=info:pmid/31751274&rft_ieee_id=8902223&rfr_iscdi=true |
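As a usage note on the identifiers in this record: the `doi_str_mv` value can be resolved to citation metadata through doi.org content negotiation, a standard feature for Crossref-registered DOIs (which IEEE articles normally are). A minimal Python sketch, assuming the third-party `requests` package is available:

```python
# Resolve this record's DOI to a BibTeX entry via doi.org content
# negotiation (standard for Crossref-registered DOIs).
import requests

DOI = "10.1109/TIP.2019.2952707"  # doi_str_mv field of this record

resp = requests.get(
    f"https://doi.org/{DOI}",
    headers={"Accept": "application/x-bibtex"},
    timeout=10,
)
resp.raise_for_status()
print(resp.text)  # e.g. an @article{...} BibTeX entry
```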