SGFNet: Structure-Guided Few-Shot Object Detection

Few-shot object detection (FSOD) focuses on detecting objects of novel classes with only a small number of annotated samples. Due to the limited number of new class samples and the presence of intra-class variance, current FSOD methods struggle to acquire sufficient discriminative information to rep...

Detailed description


Bibliographic details
Published in:	IEEE transactions on circuits and systems for video technology, 2024-11, p.1-1
Main authors:	Ma, Jingkai; Bai, Shuang
Format:	Article
Language:	eng
Subjects:	Correlation; Data mining; Feature extraction; Few shot learning; Few-shot; Filtering; Frequency-domain analysis; Interference; Object detection; saliency information; soft cosine similarity; structural information; Training; Visualization
Online access:	Order full text
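
The subject terms and abstract of this record name "soft cosine similarity" as the basis of the paper's Soft Cosine Classifier (SCC). The record does not describe how SCC is built, so the following is only a minimal sketch of the textbook soft cosine similarity, s(a, b) = aᵀSb / (sqrt(aᵀSa) · sqrt(bᵀSb)), and not the authors' classifier. The function name `soft_cosine`, the example vectors, and the feature-similarity matrix `S` are assumptions chosen for illustration.

```python
import math


def soft_cosine(a, b, S):
    """Textbook soft cosine similarity of vectors a and b.

    S[i][j] is an assumed similarity between feature i and feature j.
    With S equal to the identity matrix this reduces to plain cosine similarity.
    """
    def bilinear(x, y):
        # Computes x^T S y with explicit loops (no external libraries).
        return sum(x[i] * S[i][j] * y[j]
                   for i in range(len(x))
                   for j in range(len(y)))

    denom = math.sqrt(bilinear(a, a)) * math.sqrt(bilinear(b, b))
    if denom == 0.0:
        return 0.0  # convention chosen here for zero-norm inputs
    return bilinear(a, b) / denom


# Hypothetical two-feature example: the vectors share no feature directly.
a = [1.0, 0.0]
b = [0.0, 1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]  # features treated as unrelated
related = [[1.0, 0.5], [0.5, 1.0]]   # features treated as half-similar

plain = soft_cosine(a, b, identity)
soft = soft_cosine(a, b, related)
```

With the identity matrix the two vectors are orthogonal, while a matrix that marks the two features as related yields a nonzero score. That is the general reason a soft variant can match features that are similar but not identical.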

container_end_page	1
container_issue
container_start_page	1
container_title	IEEE transactions on circuits and systems for video technology
container_volume
creator	Ma, Jingkai ; Bai, Shuang
description	Few-shot object detection (FSOD) focuses on detecting objects of novel classes with only a small number of annotated samples. Due to the limited number of new class samples and the presence of intra-class variance, current FSOD methods struggle to acquire sufficient discriminative information to represent the corresponding class, thus restricting the performance of FSOD. To address this issue, we propose a Structure-Guided Few-shot object detection (SGFNet) method that utilizes the structural information of targets to provide richer discriminative information. Specifically, we first design a Multi-Frequency Structural Feature (MFSF) module, where the highly discriminative structural information of objects in images is extracted and used to enhance the discriminativeness of the features of the target. Based on the MFSF, we then propose a Saliency Information Enhancement (SIE) module that utilizes saliency information to enhance the object-related structural features while suppressing background interference. In addition, we present a novel Soft Cosine Classifier (SCC) based on soft cosine similarity to extract consistent discriminative information between the support and query features for distinguishing targets. Extensive experiments on PASCAL VOC and MS COCO demonstrate that our method significantly outperforms a strong baseline (up to 13.8%) and previous state-of-the-art methods (4.8% on average).
doi_str_mv	10.1109/TCSVT.2024.3507863
format	Article
fullrecord	<record><control><sourceid>crossref_RIE</sourceid><recordid>TN_cdi_ieee_primary_10770244</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10770244</ieee_id><sourcerecordid>10_1109_TCSVT_2024_3507863</sourcerecordid><originalsourceid>FETCH-LOGICAL-c644-68308f46d12272e5a191e4096a38894c875b2670d2c556e73c09768251ff044d3</originalsourceid><addsrcrecordid>eNpNj81Kw0AUhQdRsFZfQFzkBSbeufMbdxJNFIpdJLgN6eQGU9TKZIL49qa2C1fnLM534GPsWkAqBGS3dV691ikCqlRqsM7IE7YQWjuOCPp07qAFdyj0ObsYxy2AUE7ZBcOqLF4o3iVVDJOPUyBeTkNHXVLQN6_edjFZb7bkY_JAcY5h93nJzvr2faSrYy5ZXTzW-RNfrcvn_H7FvVGKGyfB9cp0AtEi6VZkghRkppXOZco7qzdoLHTotTZkpYfMGoda9D0o1cklw8OtD7txDNQ3X2H4aMNPI6DZSzd_0s1eujlKz9DNARqI6B9g7bxS8hfhLVCV</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>SGFNet: Structure-Guided Few-Shot Object Detection</title><source>IEEE Electronic Library (IEL)</source><creator>Ma, Jingkai ; Bai, Shuang</creator><creatorcontrib>Ma, Jingkai ; Bai, Shuang</creatorcontrib><description>Few-shot object detection (FSOD) focuses on detecting objects of novel classes with only a small number of annotated samples. Due to the limited number of new class samples and the presence of intra-class variance, current FSOD methods struggle to acquire sufficient discriminative information to represent the corresponding class, thus restricting the performance of FSOD. To address this issue, we propose a Structure-Guided Few-shot object detection (SGFNet) method that utilizes the structural information of targets to provide richer discriminative information. Specifically, we first design a Multi-Frequency Structural Feature (MFSF) module, where the highly discriminative structural information of objects in images is extracted and used to enhance the discriminativeness of the features of the target. 
Based on the MFSF, we then propose a Saliency Information Enhancement (SIE) module that utilizes saliency information to enhance the object-related structural features while suppressing background interference. In addition, we present a novel Soft Cosine Classifier (SCC) based on soft cosine similarity to extract consistent discriminative information between the support and query features for distinguishing targets. Extensive experiments on PASCAL VOC and MS COCO demonstrate that our method significantly outperforms a strong baseline (up to 13.8%) and previous state-of-the-art methods (4.8% in average).</description><identifier>ISSN: 1051-8215</identifier><identifier>EISSN: 1558-2205</identifier><identifier>DOI: 10.1109/TCSVT.2024.3507863</identifier><identifier>CODEN: ITCTEM</identifier><language>eng</language><publisher>IEEE</publisher><subject>Correlation ; Data mining ; Feature extraction ; Few shot learning ; Few-shot ; Filtering ; Frequency-domain analysis ; Interference ; Object detection ; saliency information ; soft cosine similarity ; structural information ; Training ; Visualization</subject><ispartof>IEEE transactions on circuits and systems for video technology, 2024-11, p.1-1</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><orcidid>0000-0003-4586-8754 ; 0009-0005-0739-8239</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10770244$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27923,27924,54757</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10770244$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Ma, Jingkai</creatorcontrib><creatorcontrib>Bai, Shuang</creatorcontrib><title>SGFNet: Structure-Guided Few-Shot Object Detection</title><title>IEEE 
transactions on circuits and systems for video technology</title><addtitle>TCSVT</addtitle><description>Few-shot object detection (FSOD) focuses on detecting objects of novel classes with only a small number of annotated samples. Due to the limited number of new class samples and the presence of intra-class variance, current FSOD methods struggle to acquire sufficient discriminative information to represent the corresponding class, thus restricting the performance of FSOD. To address this issue, we propose a Structure-Guided Few-shot object detection (SGFNet) method that utilizes the structural information of targets to provide richer discriminative information. Specifically, we first design a Multi-Frequency Structural Feature (MFSF) module, where the highly discriminative structural information of objects in images is extracted and used to enhance the discriminativeness of the features of the target. Based on the MFSF, we then propose a Saliency Information Enhancement (SIE) module that utilizes saliency information to enhance the object-related structural features while suppressing background interference. In addition, we present a novel Soft Cosine Classifier (SCC) based on soft cosine similarity to extract consistent discriminative information between the support and query features for distinguishing targets. 
Extensive experiments on PASCAL VOC and MS COCO demonstrate that our method significantly outperforms a strong baseline (up to 13.8%) and previous state-of-the-art methods (4.8% in average).</description><subject>Correlation</subject><subject>Data mining</subject><subject>Feature extraction</subject><subject>Few shot learning</subject><subject>Few-shot</subject><subject>Filtering</subject><subject>Frequency-domain analysis</subject><subject>Interference</subject><subject>Object detection</subject><subject>saliency information</subject><subject>soft cosine similarity</subject><subject>structural information</subject><subject>Training</subject><subject>Visualization</subject><issn>1051-8215</issn><issn>1558-2205</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNj81Kw0AUhQdRsFZfQFzkBSbeufMbdxJNFIpdJLgN6eQGU9TKZIL49qa2C1fnLM534GPsWkAqBGS3dV691ikCqlRqsM7IE7YQWjuOCPp07qAFdyj0ObsYxy2AUE7ZBcOqLF4o3iVVDJOPUyBeTkNHXVLQN6_edjFZb7bkY_JAcY5h93nJzvr2faSrYy5ZXTzW-RNfrcvn_H7FvVGKGyfB9cp0AtEi6VZkghRkppXOZco7qzdoLHTotTZkpYfMGoda9D0o1cklw8OtD7txDNQ3X2H4aMNPI6DZSzd_0s1eujlKz9DNARqI6B9g7bxS8hfhLVCV</recordid><startdate>20241127</startdate><enddate>20241127</enddate><creator>Ma, Jingkai</creator><creator>Bai, Shuang</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0003-4586-8754</orcidid><orcidid>https://orcid.org/0009-0005-0739-8239</orcidid></search><sort><creationdate>20241127</creationdate><title>SGFNet: Structure-Guided Few-Shot Object Detection</title><author>Ma, Jingkai ; Bai, 
Shuang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c644-68308f46d12272e5a191e4096a38894c875b2670d2c556e73c09768251ff044d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Correlation</topic><topic>Data mining</topic><topic>Feature extraction</topic><topic>Few shot learning</topic><topic>Few-shot</topic><topic>Filtering</topic><topic>Frequency-domain analysis</topic><topic>Interference</topic><topic>Object detection</topic><topic>saliency information</topic><topic>soft cosine similarity</topic><topic>structural information</topic><topic>Training</topic><topic>Visualization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ma, Jingkai</creatorcontrib><creatorcontrib>Bai, Shuang</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><jtitle>IEEE transactions on circuits and systems for video technology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ma, Jingkai</au><au>Bai, Shuang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SGFNet: Structure-Guided Few-Shot Object Detection</atitle><jtitle>IEEE transactions on circuits and systems for video technology</jtitle><stitle>TCSVT</stitle><date>2024-11-27</date><risdate>2024</risdate><spage>1</spage><epage>1</epage><pages>1-1</pages><issn>1051-8215</issn><eissn>1558-2205</eissn><coden>ITCTEM</coden><abstract>Few-shot object detection (FSOD) focuses on detecting objects of novel classes with only a small number of annotated samples. 
Due to the limited number of new class samples and the presence of intra-class variance, current FSOD methods struggle to acquire sufficient discriminative information to represent the corresponding class, thus restricting the performance of FSOD. To address this issue, we propose a Structure-Guided Few-shot object detection (SGFNet) method that utilizes the structural information of targets to provide richer discriminative information. Specifically, we first design a Multi-Frequency Structural Feature (MFSF) module, where the highly discriminative structural information of objects in images is extracted and used to enhance the discriminativeness of the features of the target. Based on the MFSF, we then propose a Saliency Information Enhancement (SIE) module that utilizes saliency information to enhance the object-related structural features while suppressing background interference. In addition, we present a novel Soft Cosine Classifier (SCC) based on soft cosine similarity to extract consistent discriminative information between the support and query features for distinguishing targets. Extensive experiments on PASCAL VOC and MS COCO demonstrate that our method significantly outperforms a strong baseline (up to 13.8%) and previous state-of-the-art methods (4.8% in average).</abstract><pub>IEEE</pub><doi>10.1109/TCSVT.2024.3507863</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0003-4586-8754</orcidid><orcidid>https://orcid.org/0009-0005-0739-8239</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1051-8215
ispartof	IEEE transactions on circuits and systems for video technology, 2024-11, p.1-1
issn	1051-8215 ; 1558-2205
language	eng
recordid	cdi_ieee_primary_10770244
source	IEEE Electronic Library (IEL)
subjects	Correlation ; Data mining ; Feature extraction ; Few shot learning ; Few-shot ; Filtering ; Frequency-domain analysis ; Interference ; Object detection ; saliency information ; soft cosine similarity ; structural information ; Training ; Visualization
title	SGFNet: Structure-Guided Few-Shot Object Detection
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T17%3A11%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SGFNet:%20Structure-Guided%20Few-Shot%20Object%20Detection&rft.jtitle=IEEE%20transactions%20on%20circuits%20and%20systems%20for%20video%20technology&rft.au=Ma,%20Jingkai&rft.date=2024-11-27&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=1051-8215&rft.eissn=1558-2205&rft.coden=ITCTEM&rft_id=info:doi/10.1109/TCSVT.2024.3507863&rft_dat=%3Ccrossref_RIE%3E10_1109_TCSVT_2024_3507863%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10770244&rfr_iscdi=true