Hierarchical Context Modeling for Video Event Recognition

Current video event recognition research remains largely target-centered. For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these chall...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence 2017-09, Vol.39 (9), p.1770-1782
Main authors: Wang, Xiaoyang; Ji, Qiang
Format: Article
Language: English
Subjects:
Online access: Order full text
container_end_page 1782
container_issue 9
container_start_page 1770
container_title IEEE transactions on pattern analysis and machine intelligence
container_volume 39
creator Wang, Xiaoyang
Ji, Qiang
description Current video event recognition research remains largely target-centered. For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduce a context-augmented video event recognition approach. Specifically, we explicitly capture different types of context at three levels: the image level, the semantic level, and the prior level. At the image level, we introduce two types of contextual features, appearance context features and interaction context features, to capture the appearance of context objects and their interactions with the target objects. At the semantic level, we propose a deep model based on a deep Boltzmann machine to learn event object representations and their interactions. At the prior level, we utilize two types of prior-level context: scene priming and dynamic cueing. Finally, we introduce a hierarchical context model that systematically integrates the contextual information at the different levels. Through the hierarchical context model, contexts at different levels jointly contribute to event recognition. We evaluate the hierarchical context model for event recognition on benchmark surveillance video datasets. Results show that incorporating context at each level can improve event recognition performance, and that jointly integrating the three levels of context through our hierarchical model achieves the best performance.
doi_str_mv 10.1109/TPAMI.2016.2616308
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_pubmed_primary_28113742</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7588132</ieee_id><sourcerecordid>1861613030</sourcerecordid><originalsourceid>FETCH-LOGICAL-c351t-30bc340e33f73b5610bf78c7ed3d241c766de26f857beed8f7b8f9c9794f265f3</originalsourceid><addsrcrecordid>eNpdkE1P3DAQhq2qqCzb_oFWqiL1wiVbjyf-OqIVBSQQqKK9Wokz3hplY-pkEfx7suzCAc1hDvO8o5mHsa_AFwDc_ry9Obm6WAgOaiEUKOTmA5uBRVuiRPuRzaaJKI0R5pAdDcMd51BJjp_YoTAAqCsxY_Y8Uq6z_xd93RXL1I_0OBZXqaUu9qsipFz8jS2l4vSB-rH4TT6t-jjG1H9mB6HuBvqy73P259fp7fK8vLw-u1ieXJYeJYwl8sZjxQkxaGykAt4EbbymFltRgddKtSRUMFI3RK0JujHBeqttFYSSAefseLf3Pqf_GxpGt46Dp66re0qbwYGZfgfkU83Zj3foXdrkfrrOgRUaQUqlJkrsKJ_TMGQK7j7HdZ2fHHC3FetexLqtWLcXO4W-71dvmjW1b5FXkxPwbQdEInoba2kMoMBnFZB64A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1927315566</pqid></control><display><type>article</type><title>Hierarchical Context Modeling for Video Event Recognition</title><source>IEEE Electronic Library (IEL)</source><creator>Wang, Xiaoyang ; Ji, Qiang</creator><creatorcontrib>Wang, Xiaoyang ; Ji, Qiang</creatorcontrib><description>Current video event recognition research remains largely target-centered. For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduced a context-augmented video event recognition approach. Specifically, we explicitly capture different types of contexts from three levels including image level, semantic level, and prior level. At the image level, we introduce two types of contextual features including the appearance context features and interaction context features to capture the appearance of context objects and their interactions with the target objects. 
At the semantic level, we propose a deep model based on deep Boltzmann machine to learn event object representations and their interactions. At the prior level, we utilize two types of prior-level contexts including scene priming and dynamic cueing. Finally, we introduce a hierarchical context model that systematically integrates the contextual information at different levels. Through the hierarchical context model, contexts at different levels jointly contribute to the event recognition. We evaluate the hierarchical context model for event recognition on benchmark surveillance video datasets. Results show that incorporating contexts in each level can improve event recognition performance, and jointly integrating three levels of contexts through our hierarchical model achieves the best performance.</description><identifier>ISSN: 0162-8828</identifier><identifier>EISSN: 1939-3539</identifier><identifier>EISSN: 2160-9292</identifier><identifier>DOI: 10.1109/TPAMI.2016.2616308</identifier><identifier>PMID: 28113742</identifier><identifier>CODEN: ITPIDJ</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Context ; Context modeling ; event recognition ; Hidden Markov models ; Hierarchical context model ; image context ; Image detection ; Image recognition ; Image resolution ; Object recognition ; Priming ; priming context ; Representations ; semantic context ; Semantics ; Surveillance ; Target recognition</subject><ispartof>IEEE transactions on pattern analysis and machine intelligence, 2017-09, Vol.39 (9), p.1770-1782</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2017</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c351t-30bc340e33f73b5610bf78c7ed3d241c766de26f857beed8f7b8f9c9794f265f3</citedby><cites>FETCH-LOGICAL-c351t-30bc340e33f73b5610bf78c7ed3d241c766de26f857beed8f7b8f9c9794f265f3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7588132$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>315,782,786,798,27931,27932,54765</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/7588132$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/28113742$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Wang, Xiaoyang</creatorcontrib><creatorcontrib>Ji, Qiang</creatorcontrib><title>Hierarchical Context Modeling for Video Event Recognition</title><title>IEEE transactions on pattern analysis and machine intelligence</title><addtitle>TPAMI</addtitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><description>Current video event recognition research remains largely target-centered. For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduced a context-augmented video event recognition approach. Specifically, we explicitly capture different types of contexts from three levels including image level, semantic level, and prior level. At the image level, we introduce two types of contextual features including the appearance context features and interaction context features to capture the appearance of context objects and their interactions with the target objects. 
At the semantic level, we propose a deep model based on deep Boltzmann machine to learn event object representations and their interactions. At the prior level, we utilize two types of prior-level contexts including scene priming and dynamic cueing. Finally, we introduce a hierarchical context model that systematically integrates the contextual information at different levels. Through the hierarchical context model, contexts at different levels jointly contribute to the event recognition. We evaluate the hierarchical context model for event recognition on benchmark surveillance video datasets. Results show that incorporating contexts in each level can improve event recognition performance, and jointly integrating three levels of contexts through our hierarchical model achieves the best performance.</description><subject>Context</subject><subject>Context modeling</subject><subject>event recognition</subject><subject>Hidden Markov models</subject><subject>Hierarchical context model</subject><subject>image context</subject><subject>Image detection</subject><subject>Image recognition</subject><subject>Image resolution</subject><subject>Object recognition</subject><subject>Priming</subject><subject>priming context</subject><subject>Representations</subject><subject>semantic context</subject><subject>Semantics</subject><subject>Surveillance</subject><subject>Target 
recognition</subject><issn>0162-8828</issn><issn>1939-3539</issn><issn>2160-9292</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkE1P3DAQhq2qqCzb_oFWqiL1wiVbjyf-OqIVBSQQqKK9Wokz3hplY-pkEfx7suzCAc1hDvO8o5mHsa_AFwDc_ry9Obm6WAgOaiEUKOTmA5uBRVuiRPuRzaaJKI0R5pAdDcMd51BJjp_YoTAAqCsxY_Y8Uq6z_xd93RXL1I_0OBZXqaUu9qsipFz8jS2l4vSB-rH4TT6t-jjG1H9mB6HuBvqy73P259fp7fK8vLw-u1ieXJYeJYwl8sZjxQkxaGykAt4EbbymFltRgddKtSRUMFI3RK0JujHBeqttFYSSAefseLf3Pqf_GxpGt46Dp66re0qbwYGZfgfkU83Zj3foXdrkfrrOgRUaQUqlJkrsKJ_TMGQK7j7HdZ2fHHC3FetexLqtWLcXO4W-71dvmjW1b5FXkxPwbQdEInoba2kMoMBnFZB64A</recordid><startdate>20170901</startdate><enddate>20170901</enddate><creator>Wang, Xiaoyang</creator><creator>Ji, Qiang</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope></search><sort><creationdate>20170901</creationdate><title>Hierarchical Context Modeling for Video Event Recognition</title><author>Wang, Xiaoyang ; Ji, Qiang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c351t-30bc340e33f73b5610bf78c7ed3d241c766de26f857beed8f7b8f9c9794f265f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Context</topic><topic>Context modeling</topic><topic>event recognition</topic><topic>Hidden Markov models</topic><topic>Hierarchical context model</topic><topic>image context</topic><topic>Image detection</topic><topic>Image recognition</topic><topic>Image resolution</topic><topic>Object recognition</topic><topic>Priming</topic><topic>priming 
context</topic><topic>Representations</topic><topic>semantic context</topic><topic>Semantics</topic><topic>Surveillance</topic><topic>Target recognition</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Xiaoyang</creatorcontrib><creatorcontrib>Ji, Qiang</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Wang, Xiaoyang</au><au>Ji, Qiang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Hierarchical Context Modeling for Video Event Recognition</atitle><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle><stitle>TPAMI</stitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><date>2017-09-01</date><risdate>2017</risdate><volume>39</volume><issue>9</issue><spage>1770</spage><epage>1782</epage><pages>1770-1782</pages><issn>0162-8828</issn><eissn>1939-3539</eissn><eissn>2160-9292</eissn><coden>ITPIDJ</coden><abstract>Current video event recognition research remains largely target-centered. 
For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduced a context-augmented video event recognition approach. Specifically, we explicitly capture different types of contexts from three levels including image level, semantic level, and prior level. At the image level, we introduce two types of contextual features including the appearance context features and interaction context features to capture the appearance of context objects and their interactions with the target objects. At the semantic level, we propose a deep model based on deep Boltzmann machine to learn event object representations and their interactions. At the prior level, we utilize two types of prior-level contexts including scene priming and dynamic cueing. Finally, we introduce a hierarchical context model that systematically integrates the contextual information at different levels. Through the hierarchical context model, contexts at different levels jointly contribute to the event recognition. We evaluate the hierarchical context model for event recognition on benchmark surveillance video datasets. Results show that incorporating contexts in each level can improve event recognition performance, and jointly integrating three levels of contexts through our hierarchical model achieves the best performance.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>28113742</pmid><doi>10.1109/TPAMI.2016.2616308</doi><tpages>13</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0162-8828
ispartof IEEE transactions on pattern analysis and machine intelligence, 2017-09, Vol.39 (9), p.1770-1782
issn 0162-8828
1939-3539
2160-9292
language eng
recordid cdi_pubmed_primary_28113742
source IEEE Electronic Library (IEL)
subjects Context
Context modeling
event recognition
Hidden Markov models
Hierarchical context model
image context
Image detection
Image recognition
Image resolution
Object recognition
Priming
priming context
Representations
semantic context
Semantics
Surveillance
Target recognition
title Hierarchical Context Modeling for Video Event Recognition
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-05T02%3A37%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Hierarchical%20Context%20Modeling%20for%20Video%20Event%20Recognition&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Wang,%20Xiaoyang&rft.date=2017-09-01&rft.volume=39&rft.issue=9&rft.spage=1770&rft.epage=1782&rft.pages=1770-1782&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2016.2616308&rft_dat=%3Cproquest_RIE%3E1861613030%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1927315566&rft_id=info:pmid/28113742&rft_ieee_id=7588132&rfr_iscdi=true
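
The resolver link in the url field is an OpenURL (ctx_ver=Z39.88-2004) whose rft.* keys repeat the bibliographic fields listed above. As a minimal sketch, assuming only Python's standard library, the query string can be decoded back into those fields. The link below is a shortened copy of the one in this record, and the helper names parse_openurl and page_count are hypothetical, chosen for illustration only.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the record's resolver link; only some rft.* keys are kept.
url = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.atitle=Hierarchical%20Context%20Modeling%20for%20Video%20Event%20Recognition"
    "&rft.au=Wang,%20Xiaoyang"
    "&rft.date=2017-09-01&rft.volume=39&rft.issue=9"
    "&rft.spage=1770&rft.epage=1782"
    "&rft.issn=0162-8828"
    "&rft_id=info:doi/10.1109/TPAMI.2016.2616308"
    "&rft_id=info:pmid/28113742"
)


def parse_openurl(link):
    """Hypothetical helper: map each query key to the list of its decoded values."""
    return parse_qs(urlsplit(link).query)


def page_count(params):
    """Hypothetical helper: pages implied by the start and end page, inclusive."""
    return int(params["rft.epage"][0]) - int(params["rft.spage"][0]) + 1


params = parse_openurl(url)
title = params["rft.atitle"][0]

# rft_id occurs more than once, so pick the value that carries the DOI prefix.
doi = next(
    value.split("info:doi/", 1)[1]
    for value in params["rft_id"]
    if value.startswith("info:doi/")
)

print(title)
print(doi)
print(page_count(params))
```

The page count derived from rft.spage and rft.epage should agree with the tpages value in the full record, which gives a quick consistency check between the link and the record.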