Learning Multiple Sequence-Based Kernels for Video Concept Detection

Kernel based methods are widely applied to concept and event detection in video. Recently, kernels working on sequences of feature vectors of a video segment have been proposed for this problem, rather than treating feature vectors of individual frames independently. It has been shown that these sequence-based kernels (based e.g., on the dynamic time warping or edit distance paradigms) outperform methods working on single frames for concepts with inherently dynamic features. Existing work on sequence-based kernels either uses a single type of feature or a fixed combination of the feature vectors of each frame. However, different features (e.g., visual and audio features) may be sampled at different (possibly even irregular) rates, and the optimal alignment between the sequences of features may be different. Multiple kernel learning (MKL) has been applied to similarly structured problems, and we propose MKL for combining different sequence-based kernels on different features for video concept detection. We demonstrate the advantage of the proposed method with experiments on the TRECVID 2011 Semantic Indexing data set.
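The core idea of the abstract can be sketched in a few lines: compute a dynamic-time-warping distance between sequences of per-frame feature vectors, turn it into a similarity via an exponential map, build one Gram matrix per feature modality (each sampled at its own rate), and combine them with per-kernel weights. This is an illustrative sketch only, not the paper's implementation: the feature data is random, the weights `beta` are fixed rather than learned by MKL, and note that an `exp(-gamma * DTW)` similarity is not guaranteed to be positive semi-definite.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two sequences of feature
    vectors (one row per frame), via the classic O(n*m) DP recurrence."""
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

def dtw_kernel_matrix(sequences, gamma=0.1):
    """Gram matrix K[i, j] = exp(-gamma * DTW(seq_i, seq_j))."""
    n = len(sequences)
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            k = np.exp(-gamma * dtw_distance(sequences[i], sequences[j]))
            K[i, j] = K[j, i] = k
    return K

# Hypothetical data: two modalities sampled at different rates, so the
# visual and audio sequences of the same video have different lengths.
rng = np.random.default_rng(0)
visual = [rng.normal(size=(rng.integers(5, 9), 4)) for _ in range(3)]
audio = [rng.normal(size=(rng.integers(10, 16), 2)) for _ in range(3)]

K_visual = dtw_kernel_matrix(visual)
K_audio = dtw_kernel_matrix(audio)

# MKL would learn these per-kernel weights from training data;
# a fixed convex combination stands in here.
beta = np.array([0.6, 0.4])
K_combined = beta[0] * K_visual + beta[1] * K_audio
```

Because each modality gets its own DTW alignment before the kernels are summed, the visual and audio streams are allowed to warp independently, which is precisely the flexibility a fixed frame-level feature concatenation cannot offer.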

Detailed Description

Bibliographic Details
Main Author: Bailer, W.
Format: Conference Proceeding
Language: English
Subjects:
Online Access: Order full text
container_end_page 77
container_start_page 73
creator Bailer, W.
description Kernel based methods are widely applied to concept and event detection in video. Recently, kernels working on sequences of feature vectors of a video segment have been proposed for this problem, rather than treating feature vectors of individual frames independently. It has been shown that these sequence-based kernels (based e.g., on the dynamic time warping or edit distance paradigms) outperform methods working on single frames for concepts with inherently dynamic features. Existing work on sequence-based kernels either uses a single type of feature or a fixed combination of the feature vectors of each frame. However, different features (e.g., visual and audio features) may be sampled at different (possibly even irregular) rates, and the optimal alignment between the sequences of features may be different. Multiple kernel learning (MKL) has been applied to similarly structured problems, and we propose MKL for combining different sequence-based kernels on different features for video concept detection. We demonstrate the advantage of the proposed method with experiments on the TRECVID 2011 Semantic Indexing data set.
doi_str_mv 10.1109/ISM.2012.22
format Conference Proceeding
publisher IEEE
isbn 9781467343701
eisbn 076954875X
eisbn 9780769548753
coden IEEPAD
tpages 5
fulltext fulltext_linktorsrc
identifier ISBN: 1467343706
ispartof 2012 IEEE International Symposium on Multimedia, 2012, p.73-77
language eng
recordid cdi_ieee_primary_6424634
source IEEE Electronic Library (IEL) Conference Proceedings
subjects feature combination
Feature extraction
fusion
Histograms
Kernel
learning
Multimedia communication
Streaming media
Vectors
Visualization
title Learning Multiple Sequence-Based Kernels for Video Concept Detection
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T06%3A44%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Learning%20Multiple%20Sequence-Based%20Kernels%20for%20Video%20Concept%20Detection&rft.btitle=2012%20IEEE%20International%20Symposium%20on%20Multimedia&rft.au=Bailer,%20W.&rft.date=2012-12&rft.spage=73&rft.epage=77&rft.pages=73-77&rft.isbn=1467343706&rft.isbn_list=9781467343701&rft.coden=IEEPAD&rft_id=info:doi/10.1109/ISM.2012.22&rft_dat=%3Cieee_6IE%3E6424634%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=076954875X&rft.eisbn_list=9780769548753&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6424634&rfr_iscdi=true