Event-based Tracking of Any Point with Motion-Robust Correlation Features

Tracking any point (TAP) recently shifted the motion estimation paradigm from focusing on individual salient points with local templates to tracking arbitrary points with global image contexts. However, while research has mostly focused on driving the accuracy of models in nominal settings, addressing scenarios with difficult lighting conditions and high-speed motions remains out of reach due to the limitations of the sensor. This work addresses this challenge with the first event camera-based TAP method. It leverages the high temporal resolution and high dynamic range of event cameras for robust high-speed tracking, and the global contexts in TAP methods to handle asynchronous and sparse event measurements. We further extend the TAP framework to handle event feature variations induced by motion - thereby addressing an open challenge in purely event-based tracking - with a novel feature alignment loss which ensures the learning of motion-robust features. Our method is trained with data from a new data generation pipeline and systematically ablated across all design decisions. It shows strong cross-dataset generalization and performs 135% better on the average Jaccard metric than the baselines. Moreover, on an established feature tracking benchmark, it achieves a 19% improvement over the previous best event-only method and even surpasses the previous best events-and-frames method by 3.7%.
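
The abstract describes a feature alignment loss that keeps learned correlation features stable under motion-induced changes in event appearance. As a purely illustrative sketch (not the authors' implementation): one way to express such a consistency objective is to penalize the cosine dissimilarity between features sampled along the same track at different timesteps, restricted to timesteps where the point is visible. The function name, tensor shapes, and the choice of a first-timestep reference below are assumptions made for this example.

```python
import torch
import torch.nn.functional as F

def feature_alignment_loss(track_features: torch.Tensor,
                           visibility: torch.Tensor) -> torch.Tensor:
    """Hypothetical motion-robust feature consistency loss (illustrative only).

    track_features: (T, N, C) features sampled at the tracked point
                    locations over T timesteps for N tracks.
    visibility:     (T, N) mask, 1.0 where the point is visible.
    """
    # L2-normalize so the comparison measures feature direction, not magnitude.
    feats = F.normalize(track_features, dim=-1)

    # Use the feature at the first (query) timestep as the reference.
    reference = feats[0:1]                                  # (1, N, C)

    # Cosine similarity between every timestep and the reference.
    similarity = (feats * reference).sum(dim=-1)            # (T, N)

    # Penalize dissimilarity only where the point is actually visible.
    loss = ((1.0 - similarity) * visibility).sum() / visibility.sum().clamp(min=1.0)
    return loss
```

In such a setup the loss pushes visible-timestep features toward the query feature, which is one plausible way to make correlation features insensitive to the speed- and direction-dependent appearance of events; the paper's actual formulation may differ.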

Bibliographic Details
Main Authors: Hamann, Friedhelm; Gehrig, Daniel; Febryanto, Filbert; Daniilidis, Kostas; Gallego, Guillermo
Format: Article
Language: English
Published: 2024-11-28
Subjects: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning; Computer Science - Robotics
DOI: 10.48550/arXiv.2412.00133
Source: arXiv.org
Online Access: Order full text (https://arxiv.org/abs/2412.00133)