GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap
In this work, we tackle the challenging problem of unsupervised video domain adaptation (UVDA) for action recognition. We specifically focus on scenarios with a substantial domain gap, in contrast to existing works, which primarily deal with small domain gaps between labeled source domains and unlabeled ta...
Published in: | arXiv.org 2023-11 |
---|---|
Main authors: | Lee, Hyogun ; Bae, Kyungho ; Seong Jong Ha ; Ko, Yumin ; Park, Gyeong-Moon ; Choi, Jinwoo |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Lee, Hyogun ; Bae, Kyungho ; Seong Jong Ha ; Ko, Yumin ; Park, Gyeong-Moon ; Choi, Jinwoo |
description | In this work, we tackle the challenging problem of unsupervised video domain adaptation (UVDA) for action recognition. We specifically focus on scenarios with a substantial domain gap, in contrast to existing works, which primarily deal with small domain gaps between labeled source domains and unlabeled target domains. To establish a more realistic setting, we introduce a novel UVDA scenario, denoted as Kinetics->BABEL, with a more considerable domain gap in terms of both temporal dynamics and background shifts. To tackle the temporal shift, i.e., the difference in action duration between the source and target domains, we propose a global-local view alignment approach. To mitigate the background shift, we propose to learn temporal-order-sensitive representations through temporal order learning and background-invariant representations through background augmentation. We empirically validate that the proposed method shows significant improvement over the existing methods on the Kinetics->BABEL dataset with a large domain gap. The code is available at https://github.com/KHUVLL/GLAD. |
format | Article |
fullrecord | <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2892799340</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2892799340</sourcerecordid><originalsourceid>FETCH-proquest_journals_28927993403</originalsourceid><addsrcrecordid>eNqNi7GOwjAQBa2TkEAc_7ASdaRgw0HoAoFQpIRr0UKWYM54g-2A-HtScD3VPGnefImeVGoUzcZSdsXA-0scx_JnKicT1RPPvEizOeSGD2iigo9o4FfTA1KjK3slGwBtCQs8_lWOm3ZmdNDota3gxA521jc1ubv2VLZhSQwZX1FbSEusAwbNFh46nKFAV9G_zLH-Fp0TGk-DN_tiuF5tl5uodnxryIf9hRtnW7WXs0ROk0SNY_XZ6wW4hEy1</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2892799340</pqid></control><display><type>article</type><title>GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap</title><source>Free E- Journals</source><creator>Lee, Hyogun ; Bae, Kyungho ; Seong Jong Ha ; Ko, Yumin ; Park, Gyeong-Moon ; Choi, Jinwoo</creator><creatorcontrib>Lee, Hyogun ; Bae, Kyungho ; Seong Jong Ha ; Ko, Yumin ; Park, Gyeong-Moon ; Choi, Jinwoo</creatorcontrib><description>In this work, we tackle the challenging problem of unsupervised video domain adaptation (UVDA) for action recognition. We specifically focus on scenarios with a substantial domain gap, in contrast to existing works primarily deal with small domain gaps between labeled source domains and unlabeled target domains. To establish a more realistic setting, we introduce a novel UVDA scenario, denoted as Kinetics->BABEL, with a more considerable domain gap in terms of both temporal dynamics and background shifts. To tackle the temporal shift, i.e., action duration difference between the source and target domains, we propose a global-local view alignment approach. 
To mitigate the background shift, we propose to learn temporal order sensitive representations by temporal order learning and background invariant representations by background augmentation. We empirically validate that the proposed method shows significant improvement over the existing methods on the Kinetics->BABEL dataset with a large domain gap. The code is available at https://github.com/KHUVLL/GLAD.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Adaptation ; Alignment ; Kinetics ; Representations</subject><ispartof>arXiv.org, 2023-11</ispartof><rights>2023. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Lee, Hyogun</creatorcontrib><creatorcontrib>Bae, Kyungho</creatorcontrib><creatorcontrib>Seong Jong Ha</creatorcontrib><creatorcontrib>Ko, Yumin</creatorcontrib><creatorcontrib>Park, Gyeong-Moon</creatorcontrib><creatorcontrib>Choi, Jinwoo</creatorcontrib><title>GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap</title><title>arXiv.org</title><description>In this work, we tackle the challenging problem of unsupervised video domain adaptation (UVDA) for action recognition. We specifically focus on scenarios with a substantial domain gap, in contrast to existing works primarily deal with small domain gaps between labeled source domains and unlabeled target domains. 
To establish a more realistic setting, we introduce a novel UVDA scenario, denoted as Kinetics->BABEL, with a more considerable domain gap in terms of both temporal dynamics and background shifts. To tackle the temporal shift, i.e., action duration difference between the source and target domains, we propose a global-local view alignment approach. To mitigate the background shift, we propose to learn temporal order sensitive representations by temporal order learning and background invariant representations by background augmentation. We empirically validate that the proposed method shows significant improvement over the existing methods on the Kinetics->BABEL dataset with a large domain gap. The code is available at https://github.com/KHUVLL/GLAD.</description><subject>Adaptation</subject><subject>Alignment</subject><subject>Kinetics</subject><subject>Representations</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNi7GOwjAQBa2TkEAc_7ASdaRgw0HoAoFQpIRr0UKWYM54g-2A-HtScD3VPGnefImeVGoUzcZSdsXA-0scx_JnKicT1RPPvEizOeSGD2iigo9o4FfTA1KjK3slGwBtCQs8_lWOm3ZmdNDota3gxA521jc1ubv2VLZhSQwZX1FbSEusAwbNFh46nKFAV9G_zLH-Fp0TGk-DN_tiuF5tl5uodnxryIf9hRtnW7WXs0ROk0SNY_XZ6wW4hEy1</recordid><startdate>20231122</startdate><enddate>20231122</enddate><creator>Lee, Hyogun</creator><creator>Bae, Kyungho</creator><creator>Seong Jong Ha</creator><creator>Ko, Yumin</creator><creator>Park, Gyeong-Moon</creator><creator>Choi, Jinwoo</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20231122</creationdate><title>GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap</title><author>Lee, Hyogun ; Bae, Kyungho ; Seong Jong Ha ; Ko, Yumin ; Park, Gyeong-Moon ; Choi, Jinwoo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_28927993403</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Adaptation</topic><topic>Alignment</topic><topic>Kinetics</topic><topic>Representations</topic><toplevel>online_resources</toplevel><creatorcontrib>Lee, Hyogun</creatorcontrib><creatorcontrib>Bae, Kyungho</creatorcontrib><creatorcontrib>Seong Jong Ha</creatorcontrib><creatorcontrib>Ko, Yumin</creatorcontrib><creatorcontrib>Park, Gyeong-Moon</creatorcontrib><creatorcontrib>Choi, Jinwoo</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering 
Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lee, Hyogun</au><au>Bae, Kyungho</au><au>Seong Jong Ha</au><au>Ko, Yumin</au><au>Park, Gyeong-Moon</au><au>Choi, Jinwoo</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap</atitle><jtitle>arXiv.org</jtitle><date>2023-11-22</date><risdate>2023</risdate><eissn>2331-8422</eissn><abstract>In this work, we tackle the challenging problem of unsupervised video domain adaptation (UVDA) for action recognition. We specifically focus on scenarios with a substantial domain gap, in contrast to existing works primarily deal with small domain gaps between labeled source domains and unlabeled target domains. To establish a more realistic setting, we introduce a novel UVDA scenario, denoted as Kinetics->BABEL, with a more considerable domain gap in terms of both temporal dynamics and background shifts. To tackle the temporal shift, i.e., action duration difference between the source and target domains, we propose a global-local view alignment approach. To mitigate the background shift, we propose to learn temporal order sensitive representations by temporal order learning and background invariant representations by background augmentation. We empirically validate that the proposed method shows significant improvement over the existing methods on the Kinetics->BABEL dataset with a large domain gap. 
The code is available at https://github.com/KHUVLL/GLAD.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2023-11 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2892799340 |
source | Free E-Journals |
subjects | Adaptation ; Alignment ; Kinetics ; Representations |
title | GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T00%3A52%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=GLAD:%20Global-Local%20View%20Alignment%20and%20Background%20Debiasing%20for%20Unsupervised%20Video%20Domain%20Adaptation%20with%20Large%20Domain%20Gap&rft.jtitle=arXiv.org&rft.au=Lee,%20Hyogun&rft.date=2023-11-22&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2892799340%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2892799340&rft_id=info:pmid/&rfr_iscdi=true |