Crowd Counting via Unsupervised Cross-Domain Feature Adaptation

Given an image, crowd counting aims to estimate the number of target objects in the image. Because the installation conditions of surveillance systems (and other equipment) are unpredictable, crowd counting images from different data sets may exhibit severe discrepancies in viewing angle, scale, lighting condition, etc. As it is usually expensive and time-consuming to annotate each data set for model training, transferring a well-trained model from a labeled data set (the source domain) to a new data set (the target domain) has become an essential issue in crowd counting. To tackle this problem, we propose a cross-domain learning network that learns the domain gaps in an unsupervised manner. The proposed network comprises a Multi-granularity Feature-aware Discriminator (MFD) module, a Domain-invariant Feature Adaptation (DFA) module, and a Cross-domain Vanishing Bridge (CVB) module, which together remove domain-specific information from the extracted features and improve the mapping performance of the network. Unlike most existing methods, which use only a Global Feature Discriminator (GFD) to align features at the image level, an additional Local Feature Discriminator (LFD) is inserted and, together with the GFD, forms the MFD module. As a complement to the GFD, the LFD refines features at the pixel level and is able to align local features. The DFA module explicitly measures the distances between the source-domain and target-domain features and aligns the marginal distributions of these features with Maximum Mean Discrepancy (MMD). Finally, the CVB module provides an incremental capability for removing the interfering part of the extracted features. Several well-known networks are adopted as the backbone of our algorithm to prove the effectiveness of the proposed adaptation structure. Comprehensive experiments demonstrate that our model achieves performance competitive with state-of-the-art methods.
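The MMD criterion used by the DFA module to align the marginal distributions of source and target features can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian (RBF) kernel, its bandwidth, and the batch shapes are assumptions chosen only to show how a domain gap registers as a large discrepancy value.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=4.0):
    """RBF kernel matrix between two batches of feature vectors."""
    # Pairwise squared Euclidean distances, shape (len(x), len(y)).
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(source, target, sigma=4.0):
    """Biased estimator of the squared MMD between two feature batches."""
    return (gaussian_kernel(source, source, sigma).mean()
            + gaussian_kernel(target, target, sigma).mean()
            - 2.0 * gaussian_kernel(source, target, sigma).mean())

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, (64, 16))        # "source domain" features
tgt_same = rng.normal(0.0, 1.0, (64, 16))   # same distribution, no gap
tgt_shift = rng.normal(3.0, 1.0, (64, 16))  # shifted "target domain"

# A genuine domain gap yields a much larger discrepancy than
# two batches drawn from the same distribution.
assert mmd2(src, tgt_shift) > mmd2(src, tgt_same) >= 0.0
```

In the paper's setting the features would come from the counting network's backbone rather than a random generator, and minimizing this quantity during training pulls the two marginal feature distributions together.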

Detailed Description

Bibliographic Details
Published in: IEEE Transactions on Multimedia, 2023, Vol. 25, pp. 4665-4678
Main authors: Ding, Guanchen; Yang, Daiqin; Wang, Tao; Wang, Sihan; Zhang, Yunfei
Format: Article
Language: eng
Subjects:
container_end_page 4678
container_issue
container_start_page 4665
container_title IEEE transactions on multimedia
container_volume 25
creator Ding, Guanchen
Yang, Daiqin
Wang, Tao
Wang, Sihan
Zhang, Yunfei
description Given an image, crowd counting aims to estimate the number of target objects in the image. Because the installation conditions of surveillance systems (and other equipment) are unpredictable, crowd counting images from different data sets may exhibit severe discrepancies in viewing angle, scale, lighting condition, etc. As it is usually expensive and time-consuming to annotate each data set for model training, transferring a well-trained model from a labeled data set (the source domain) to a new data set (the target domain) has become an essential issue in crowd counting. To tackle this problem, we propose a cross-domain learning network that learns the domain gaps in an unsupervised manner. The proposed network comprises a Multi-granularity Feature-aware Discriminator (MFD) module, a Domain-invariant Feature Adaptation (DFA) module, and a Cross-domain Vanishing Bridge (CVB) module, which together remove domain-specific information from the extracted features and improve the mapping performance of the network. Unlike most existing methods, which use only a Global Feature Discriminator (GFD) to align features at the image level, an additional Local Feature Discriminator (LFD) is inserted and, together with the GFD, forms the MFD module. As a complement to the GFD, the LFD refines features at the pixel level and is able to align local features. The DFA module explicitly measures the distances between the source-domain and target-domain features and aligns the marginal distributions of these features with Maximum Mean Discrepancy (MMD). Finally, the CVB module provides an incremental capability for removing the interfering part of the extracted features. Several well-known networks are adopted as the backbone of our algorithm to prove the effectiveness of the proposed adaptation structure. Comprehensive experiments demonstrate that our model achieves performance competitive with state-of-the-art methods.
doi_str_mv 10.1109/TMM.2022.3180222
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 1520-9210
ispartof IEEE transactions on multimedia, 2023, Vol.25, p.4665-4678
issn 1520-9210
1941-0077
language eng
recordid cdi_crossref_primary_10_1109_TMM_2022_3180222
source IEEE/IET Electronic Library (IEL)
subjects Adaptation
Adaptation models
adversarial learning
Algorithms
Annotations
Bridges
Crowd monitoring
Data models
Datasets
density map estimation
Discriminators
domain adaptation
Feature extraction
Modules
Semantics
Surveillance systems
Training
Unsupervised crowd counting
Unsupervised learning
title Crowd Counting via Unsupervised Cross-Domain Feature Adaptation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T08%3A28%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Crowd%20Counting%20via%20Unsupervised%20Cross-Domain%20Feature%20Adaptation&rft.jtitle=IEEE%20transactions%20on%20multimedia&rft.au=Ding,%20Guanchen&rft.date=2023&rft.volume=25&rft.spage=4665&rft.epage=4678&rft.pages=4665-4678&rft.issn=1520-9210&rft.eissn=1941-0077&rft.coden=ITMUF8&rft_id=info:doi/10.1109/TMM.2022.3180222&rft_dat=%3Cproquest_RIE%3E2884893469%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2884893469&rft_id=info:pmid/&rft_ieee_id=9788041&rfr_iscdi=true