FDTrack: A Dual-Head Focus Tracking Network With Frequency Enhancement
The RGB-T tracking approach combines the advantages of visible and thermal sensors to achieve accurate target tracking in complex scenarios. However, previous RGB-T trackers based on the self-attention (SA) mechanism overlook crucial high-frequency information (such as texture, edges, and colors) that is essential for object prediction. To address these challenges, we propose a frequency-enhanced dual-head focus tracking network (FDTrack) for RGB-T tracking. FDTrack comprises four main components: high-frequency feature enhancement (HFFE), wavelet multifrequency (WMF) interaction, autonomous modality prediction (AMP), and search focus preprocessing (SFP). HFFE refines the features from the ViT backbones within specific modalities by adaptively amplifying high-frequency features. In contrast, WMF facilitates communication between different frequency bands to enhance the interaction of RGB-T features from a frequency perspective. To improve tracking robustness under extreme scenes, AMP incorporates dual prediction heads and determines the final outcome through feature matching. SFP adjusts the convolution kernel size based on pixel-to-target distance and preprocesses the search region with Gaussian blur to reduce background clutter interference and emphasize the target. Extensive experimental results demonstrate that FDTrack achieves competitive performance compared to state-of-the-art algorithms across various datasets, including RGBT210, RGBT234, and LasHeR, showcasing its cutting-edge capabilities in this field.
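The high-frequency information mentioned in the abstract (texture, edges) can be separated from low-frequency content with a wavelet transform, the tool underlying the paper's WMF module. A minimal single-level 2-D Haar decomposition in NumPy, purely illustrative (the paper's module is learned; this fixed transform only shows how the frequency bands arise):

```python
import numpy as np

def haar_dwt2(x):
    """One level of a 2-D Haar wavelet transform.

    Returns the low-frequency approximation (LL) and the three
    high-frequency detail bands (LH, HL, HH), each at half resolution.
    """
    # Average/difference pairs of rows, then pairs of columns.
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # row lowpass
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # row highpass
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0  # low-low: coarse structure
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0  # horizontal detail
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0  # vertical detail
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0  # diagonal detail
    return ll, lh, hl, hh
```

A flat region lands entirely in LL, while edges and texture show up in the detail bands — exactly the information the abstract says plain self-attention tends to discard.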
Saved in:
Published in: | IEEE sensors journal 2025-01, Vol.25 (2), p.3879-3897 |
---|---|
Main authors: | Gao, Zhao; Zhou, Dongming; Cao, Jinde; Liu, Yisong; Shan, Qingqing |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Order full text |
container_end_page | 3897 |
---|---|
container_issue | 2 |
container_start_page | 3879 |
container_title | IEEE sensors journal |
container_volume | 25 |
creator | Gao, Zhao Zhou, Dongming Cao, Jinde Liu, Yisong Shan, Qingqing |
description | The RGB-T tracking approach combines the advantages of visible and thermal sensors to achieve accurate target tracking in complex scenarios. However, previous RGB-T trackers based on the self-attention (SA) mechanism overlook crucial high-frequency information (such as texture, edges, and colors) that is essential for object prediction. To address these challenges, we propose a frequency-enhanced dual-head focus tracking network (FDTrack) for RGB-T tracking. FDTrack comprises four main components: high-frequency feature enhancement (HFFE), wavelet multifrequency (WMF) interaction, autonomous modality prediction (AMP), and search focus preprocessing (SFP). HFFE refines the features from the ViT backbones within specific modalities by adaptively amplifying high-frequency features. In contrast, WMF facilitates communication between different frequency bands to enhance the interaction of RGB-T features from a frequency perspective. To improve tracking robustness under extreme scenes, AMP incorporates dual prediction heads and determines the final outcome through feature matching. SFP adjusts the convolution kernel size based on pixel-to-target distance and preprocesses the search region with Gaussian blur to reduce background clutter interference and emphasize the target. Extensive experimental results demonstrate that FDTrack achieves competitive performance compared to state-of-the-art algorithms across various datasets, including RGBT210, RGBT234, and LasHeR, showcasing its cutting-edge capabilities in this field. |
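The SFP step described above blurs the search region more strongly the farther a pixel lies from the target. A rough numerical sketch of that idea, assuming a known target center and approximating the per-pixel kernel with a few distance bands (function names, band count, and sigma values are our own illustration, not the paper's kernel-size rule):

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2*radius + 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur2d(img, sigma):
    """Separable Gaussian blur via 1-D convolutions along each axis."""
    if sigma <= 0:
        return img.copy()
    radius = max(1, int(3 * sigma))
    k = gaussian_kernel(sigma, radius)
    pad = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def sfp_preprocess(search, center, band_sigmas=(0.0, 1.0, 2.0, 4.0)):
    """Banded approximation of distance-dependent blur.

    The search region is split into rings around the (assumed known)
    target center; farther rings receive a stronger Gaussian blur,
    suppressing background clutter while keeping the target sharp.
    """
    h, w = search.shape
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.hypot(yy - center[0], xx - center[1])
    edges = np.linspace(0.0, dist.max() + 1e-6, len(band_sigmas) + 1)
    out = np.zeros_like(search, dtype=float)
    for sigma, lo, hi in zip(band_sigmas, edges[:-1], edges[1:]):
        mask = (dist >= lo) & (dist < hi)
        out[mask] = blur2d(search.astype(float), sigma)[mask]
    return out
```

The innermost band uses sigma 0, so pixels at the target center pass through untouched, while the periphery is smoothed — the same qualitative effect the abstract attributes to SFP.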
doi_str_mv | 10.1109/JSEN.2024.3506929 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1530-437X |
ispartof | IEEE sensors journal, 2025-01, Vol.25 (2), p.3879-3897 |
issn | 1530-437X 1558-1748 |
language | eng |
recordid | cdi_proquest_journals_3155820531 |
source | IEEE Electronic Library (IEL) |
subjects | Accuracy; Algorithms; Clutter; Feature extraction; Frequencies; Frequency enhanced; Head; Image processing; Ions; Iron; Magnetic heads; multifrequency interaction; Object tracking; Real-time systems; RGB-T tracking; Target tracking; Tracking; Tracking networks; Transformers; visual focusing |
title | FDTrack: A Dual-Head Focus Tracking Network With Frequency Enhancement |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T07%3A52%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=FDTrack:%20A%20Dual-Head%20Focus%20Tracking%20Network%20With%20Frequency%20Enhancement&rft.jtitle=IEEE%20sensors%20journal&rft.au=Gao,%20Zhao&rft.date=2025-01-15&rft.volume=25&rft.issue=2&rft.spage=3879&rft.epage=3897&rft.pages=3879-3897&rft.issn=1530-437X&rft.eissn=1558-1748&rft.coden=ISJEAZ&rft_id=info:doi/10.1109/JSEN.2024.3506929&rft_dat=%3Cproquest_RIE%3E3155820531%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3155820531&rft_id=info:pmid/&rft_ieee_id=10778233&rfr_iscdi=true |