GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap

In this work, we tackle the challenging problem of unsupervised video domain adaptation (UVDA) for action recognition. We specifically focus on scenarios with a substantial domain gap, in contrast to existing works that primarily deal with small domain gaps between labeled source domains and unlabeled target domains. To establish a more realistic setting, we introduce a novel UVDA scenario, denoted as Kinetics->BABEL, with a more considerable domain gap in terms of both temporal dynamics and background shifts. To tackle the temporal shift, i.e., the action duration difference between the source and target domains, we propose a global-local view alignment approach. To mitigate the background shift, we propose to learn temporal order sensitive representations by temporal order learning and background invariant representations by background augmentation. We empirically validate that the proposed method shows significant improvement over existing methods on the Kinetics->BABEL dataset with a large domain gap. The code is available at https://github.com/KHUVLL/GLAD.
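To make the background-augmentation idea in the abstract concrete, below is a minimal PyTorch sketch of one plausible realization: each frame of a training clip is blended with a static frame taken from another video, so the background statistics change while the motion (and hence temporal order) is preserved; a consistency or classification loss on both views then discourages the model from relying on static background cues. The function name background_augment, the tensor shapes, and the mixing coefficient lam are illustrative assumptions, not the authors' actual implementation (see the linked repository for that).

```python
import torch

def background_augment(video: torch.Tensor, bg_frame: torch.Tensor, lam: float = 0.5) -> torch.Tensor:
    # video:    (T, C, H, W) clip tensor
    # bg_frame: (C, H, W) static frame drawn from another video
    # lam:      assumed mixing coefficient (hyperparameter)
    # Blending every frame with the same static background alters background
    # appearance while leaving the temporal dynamics of the clip intact.
    return (1.0 - lam) * video + lam * bg_frame.unsqueeze(0)

# Usage sketch with dummy data: train so that predictions on `video` and
# `augmented` agree, encouraging background-invariant representations.
video = torch.rand(16, 3, 224, 224)
bg = torch.rand(3, 224, 224)
augmented = background_augment(video, bg, lam=0.5)
```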

Bibliographic Details
Published in: arXiv.org, 2023-11
Main Authors: Lee, Hyogun; Bae, Kyungho; Seong Jong Ha; Ko, Yumin; Park, Gyeong-Moon; Choi, Jinwoo
Format: Article
Language: English
EISSN: 2331-8422
Publisher: Cornell University Library, arXiv.org (Ithaca)
Subjects: Adaptation; Alignment; Kinetics; Representations
Online Access: Full text