Graph Based Temporal Aggregation for Video Retrieval

Bibliographic details
Main authors: Srinivasan, Arvind; Bharadwaj, Aprameya; Saha, Aveek; Natarajan, Subramanyam
Format: Article
Language: eng
Subject terms:
Online access: Order full text
creator Srinivasan, Arvind; Bharadwaj, Aprameya; Saha, Aveek; Natarajan, Subramanyam
description Large-scale video retrieval is an active area of research. Most work in the field addresses video retrieval through text queries, using techniques such as VSE++. However, little research has been done on video retrieval through image queries, and the existing work either uses query images drawn from within the video dataset or iterates through videos frame by frame. These approaches do not generalize to queries from outside the dataset and do not scale well to large video datasets. To overcome these issues, we propose a new approach for video retrieval through image queries in which an undirected graph is constructed from the combined set of frames of all videos to be searched. The node features of this graph are then used for video retrieval. Experiments are performed on the MSR-VTT dataset using query images from outside the dataset. To evaluate this approach, P@5, P@10, and P@20 metrics are calculated. Two ResNet models, ResNet-152 and ResNet-50, are used in this study.
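
The description above outlines the pipeline only at a high level: per-frame features from a ResNet backbone, one undirected graph over the combined frame set of all videos, and precision-at-k evaluation for image queries from outside the dataset. The sketch below is a minimal illustration of that idea, not the authors' exact method: the frame sampling, the k-nearest-neighbour graph construction, the vote-based video ranking, and all parameter values (k, the number of retrieved neighbours) are assumptions made for the example.

# Minimal sketch of image-to-video retrieval over a graph of frames.
# Assumptions (not from the record): frames are pre-sampled per video,
# the graph is a k-NN graph over ResNet features, and videos are ranked
# by a simple vote over the query's nearest frame nodes.
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from sklearn.neighbors import NearestNeighbors

# Frozen ResNet-50 backbone (the record also mentions ResNet-152) used as a
# 2048-d feature extractor for frames and query images.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()  # drop the classification head
backbone.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(image_path):
    """L2-normalised ResNet feature for one frame or query image."""
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    f = backbone(x).squeeze(0).numpy()
    return f / (np.linalg.norm(f) + 1e-8)

def build_frame_graph(frame_feats, k=10):
    """Undirected k-NN graph over the combined frame set of all videos
    (k is an assumed value). Returns the neighbour index and the edge set."""
    nn_index = NearestNeighbors(n_neighbors=k + 1, metric="cosine").fit(frame_feats)
    _, idx = nn_index.kneighbors(frame_feats)
    # skip column 0 (each frame is its own nearest neighbour)
    edges = {(min(i, j), max(i, j)) for i, row in enumerate(idx) for j in row[1:]}
    return nn_index, edges

def retrieve(query_path, nn_index, frame_to_video, top=20, n_hits=50):
    """Rank videos by how many of the query's nearest frame nodes they own."""
    q = embed(query_path).reshape(1, -1)
    _, idx = nn_index.kneighbors(q, n_neighbors=n_hits)
    votes = {}
    for frame_id in idx[0]:
        vid = frame_to_video[frame_id]
        votes[vid] = votes.get(vid, 0) + 1
    return sorted(votes, key=votes.get, reverse=True)[:top]

def precision_at_k(ranked, relevant, k):
    """P@k as reported in the record (P@5, P@10, P@20)."""
    return sum(1 for v in ranked[:k] if v in relevant) / k

In this sketch a video is ranked by how many of the query's nearest frame nodes it contributes; that voting rule is a stand-in for whatever node-feature aggregation the paper actually performs on the graph.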
doi_str_mv 10.48550/arxiv.2011.02426
format Article
creationdate 2020-11-04
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
backlink https://arxiv.org/abs/2011.02426
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2011.02426
language eng
recordid cdi_arxiv_primary_2011_02426
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
Computer Science - Information Retrieval
Computer Science - Learning
title Graph Based Temporal Aggregation for Video Retrieval
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T09%3A13%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Graph%20Based%20Temporal%20Aggregation%20for%20Video%20Retrieval&rft.au=Srinivasan,%20Arvind&rft.date=2020-11-04&rft_id=info:doi/10.48550/arxiv.2011.02426&rft_dat=%3Carxiv_GOX%3E2011_02426%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true