Hierarchical Context Modeling for Video Event Recognition
Current video event recognition research remains largely target-centered. For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduce a context-augmented video event recognition approach. Specifically, we explicitly capture different types of contexts from three levels including image level, semantic level, and prior level. At the image level, we introduce two types of contextual features including the appearance context features and interaction context features to capture the appearance of context objects and their interactions with the target objects. At the semantic level, we propose a deep model based on a deep Boltzmann machine to learn event object representations and their interactions. At the prior level, we utilize two types of prior-level contexts including scene priming and dynamic cueing. Finally, we introduce a hierarchical context model that systematically integrates the contextual information at different levels. Through the hierarchical context model, contexts at different levels jointly contribute to event recognition. We evaluate the hierarchical context model for event recognition on benchmark surveillance video datasets. Results show that incorporating contexts at each level can improve event recognition performance, and jointly integrating three levels of contexts through our hierarchical model achieves the best performance.
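As background for the abstract above: the semantic-level component is described as a model based on a deep Boltzmann machine (DBM). The paper's exact formulation is not part of this record, so the following shows only the generic two-hidden-layer DBM energy and likelihood (bias terms omitted), included as a reference point for what such a model optimizes.

```latex
% Generic two-hidden-layer deep Boltzmann machine (not the paper's exact model):
% visible units v, hidden layers h^1 and h^2, weight matrices W^1 and W^2.
E(\mathbf{v},\mathbf{h}^1,\mathbf{h}^2;\theta)
  = -\mathbf{v}^{\top}\mathbf{W}^1\mathbf{h}^1
    - (\mathbf{h}^1)^{\top}\mathbf{W}^2\mathbf{h}^2 ,
\qquad
P(\mathbf{v};\theta)
  = \frac{1}{Z(\theta)}\sum_{\mathbf{h}^1,\mathbf{h}^2}
    \exp\!\bigl(-E(\mathbf{v},\mathbf{h}^1,\mathbf{h}^2;\theta)\bigr)
```

Going by the abstract, the visible units would presumably carry event-object evidence and the hidden layers the learned event-object representations and interactions, but those specifics should be taken from the article itself.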
Published in: | IEEE transactions on pattern analysis and machine intelligence 2017-09, Vol.39 (9), p.1770-1782 |
---|---|
Main authors: | Wang, Xiaoyang, Ji, Qiang |
Format: | Article |
Language: | eng |
Subjects: | Context; Context modeling; event recognition; Hidden Markov models; Hierarchical context model; image context; Image detection; Image recognition; Image resolution; Object recognition; Priming; priming context; Representations; semantic context; Semantics; Surveillance; Target recognition |
Online access: | Order full text |
container_end_page | 1782 |
---|---|
container_issue | 9 |
container_start_page | 1770 |
container_title | IEEE transactions on pattern analysis and machine intelligence |
container_volume | 39 |
creator | Wang, Xiaoyang; Ji, Qiang |
description | Current video event recognition research remains largely target-centered. For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduced a context-augmented video event recognition approach. Specifically, we explicitly capture different types of contexts from three levels including image level, semantic level, and prior level. At the image level, we introduce two types of contextual features including the appearance context features and interaction context features to capture the appearance of context objects and their interactions with the target objects. At the semantic level, we propose a deep model based on deep Boltzmann machine to learn event object representations and their interactions. At the prior level, we utilize two types of prior-level contexts including scene priming and dynamic cueing. Finally, we introduce a hierarchical context model that systematically integrates the contextual information at different levels. Through the hierarchical context model, contexts at different levels jointly contribute to the event recognition. We evaluate the hierarchical context model for event recognition on benchmark surveillance video datasets. Results show that incorporating contexts in each level can improve event recognition performance, and jointly integrating three levels of contexts through our hierarchical model achieves the best performance. |
doi_str_mv | 10.1109/TPAMI.2016.2616308 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0162-8828 |
ispartof | IEEE transactions on pattern analysis and machine intelligence, 2017-09, Vol.39 (9), p.1770-1782 |
issn | 0162-8828; 1939-3539; 2160-9292 |
language | eng |
recordid | cdi_pubmed_primary_28113742 |
source | IEEE Electronic Library (IEL) |
subjects | Context; Context modeling; event recognition; Hidden Markov models; Hierarchical context model; image context; Image detection; Image recognition; Image resolution; Object recognition; Priming; priming context; Representations; semantic context; Semantics; Surveillance; Target recognition |
title | Hierarchical Context Modeling for Video Event Recognition |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-05T02%3A37%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Hierarchical%20Context%20Modeling%20for%20Video%20Event%20Recognition&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Wang,%20Xiaoyang&rft.date=2017-09-01&rft.volume=39&rft.issue=9&rft.spage=1770&rft.epage=1782&rft.pages=1770-1782&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2016.2616308&rft_dat=%3Cproquest_RIE%3E1861613030%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1927315566&rft_id=info:pmid/28113742&rft_ieee_id=7588132&rfr_iscdi=true |