A Scalable Approach to Activity Recognition based on Object Use

We propose an approach to activity recognition based on detecting and analyzing the sequence of objects that are being manipulated by the user. In domains such as cooking, where many activities involve similar actions, object-use information can be a valuable cue. In order for this approach to scale...

Bibliographic Details
Main Authors: Jianxin Wu, Osuntogun, A., Choudhury, T., Philipose, M., Rehg, J.M.
Format: Conference Proceeding
Language: English
Subjects: Bayesian methods; Character recognition; Data mining; Educational institutions; Humans; Information resources; Object detection; Object recognition; Radiofrequency identification; RFID tags
Online Access: Order full text
container_end_page 8
container_start_page 1
creator Jianxin Wu
Osuntogun, A.
Choudhury, T.
Philipose, M.
Rehg, J.M.
description We propose an approach to activity recognition based on detecting and analyzing the sequence of objects that are being manipulated by the user. In domains such as cooking, where many activities involve similar actions, object-use information can be a valuable cue. In order for this approach to scale to many activities and objects, however, it is necessary to minimize the amount of human-labeled data that is required for modeling. We describe a method for automatically acquiring object models from video without any explicit human supervision. Our approach leverages sparse and noisy readings from RFID tagged objects, along with common-sense knowledge about which objects are likely to be used during a given activity, to bootstrap the learning process. We present a dynamic Bayesian network model which combines RFID and video data to jointly infer the most likely activity and object labels. We demonstrate that our approach can achieve activity recognition rates of more than 80% on a real-world dataset consisting of 16 household activities involving 33 objects with significant background clutter. We show that the combination of visual object recognition with RFID data is significantly more effective than the RFID sensor alone. Our work demonstrates that it is possible to automatically learn object models from video of household activities and employ these models for activity recognition, without requiring any explicit human labeling.
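The description above mentions a dynamic Bayesian network that fuses RFID readings and video-based object recognition to jointly infer activity and object labels. The paper's actual model is not reproduced here; as a loose illustration of the underlying idea only, the minimal Python sketch below runs Viterbi decoding over a toy two-activity HMM whose emissions are per-time-step sets of observed objects (standing in for a fused RFID/video reading). All activity names, objects, and probabilities are invented for the example.

import math

# Hypothetical sketch, not the paper's implementation: hidden state is the
# current activity; each time step emits a set of objects observed via RFID
# (sparse, noisy) and/or video. Parameters below are hand-picked toy values.
activities = ["make_tea", "make_cereal"]

# P(object is used at a time step | activity) -- standing in for the
# common-sense object-use prior the abstract describes.
p_object_given_activity = {
    "make_tea":    {"kettle": 0.6,  "cup": 0.5, "bowl": 0.05, "spoon": 0.3},
    "make_cereal": {"kettle": 0.05, "cup": 0.1, "bowl": 0.6,  "spoon": 0.5},
}

p_stay = 0.9  # activities tend to persist across time steps
p_switch = (1 - p_stay) / (len(activities) - 1)

def emission_logprob(activity, observed_objects):
    # Log-likelihood of the observed object set under an activity,
    # treating each known object as an independent Bernoulli.
    lp = 0.0
    for obj, p_use in p_object_given_activity[activity].items():
        p = p_use if obj in observed_objects else (1.0 - p_use)
        lp += math.log(p)
    return lp

def viterbi(observation_seq):
    # Most likely activity sequence given per-step object observations.
    V = [{a: emission_logprob(a, observation_seq[0]) for a in activities}]
    back = []
    for obs in observation_seq[1:]:
        scores, ptrs = {}, {}
        for a in activities:
            def trans_score(prev, a=a):
                return V[-1][prev] + math.log(p_stay if prev == a else p_switch)
            best_prev = max(activities, key=trans_score)
            scores[a] = trans_score(best_prev) + emission_logprob(a, obs)
            ptrs[a] = best_prev
        V.append(scores)
        back.append(ptrs)
    # Backtrace from the best final state.
    state = max(activities, key=lambda a: V[-1][a])
    path = [state]
    for ptrs in reversed(back):
        state = ptrs[state]
        path.append(state)
    return list(reversed(path))

# Fused observations: RFID alone might miss the cup; video fills it in.
obs = [{"kettle"}, {"kettle", "cup"}, {"cup", "spoon"}]
print(viterbi(obs))  # -> ['make_tea', 'make_tea', 'make_tea']

In the toy run, the cup only appears in later steps (as visual recognition might add to sparse RFID readings), and the decoder still settles on a single consistent activity, which is the gist of the joint-inference argument in the abstract.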
doi_str_mv 10.1109/ICCV.2007.4408865
format Conference Proceeding
identifier ISSN: 1550-5499; EISSN: 2380-7504; ISBN: 1424416302, 9781424416301; EISBN: 1424416310, 9781424416318
ispartof 2007 IEEE 11th International Conference on Computer Vision, 2007, p.1-8
issn 1550-5499
2380-7504
language eng
recordid cdi_ieee_primary_4408865
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Bayesian methods
Character recognition
Data mining
Educational institutions
Humans
Information resources
Object detection
Object recognition
Radiofrequency identification
RFID tags
title A Scalable Approach to Activity Recognition based on Object Use
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T17%3A04%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=A%20Scalable%20Approach%20to%20Activity%20Recognition%20based%20on%20Object%20Use&rft.btitle=2007%20IEEE%2011th%20International%20Conference%20on%20Computer%20Vision&rft.au=Jianxin%20Wu&rft.date=2007-01-01&rft.spage=1&rft.epage=8&rft.pages=1-8&rft.issn=1550-5499&rft.eissn=2380-7504&rft.isbn=1424416302&rft.isbn_list=9781424416301&rft_id=info:doi/10.1109/ICCV.2007.4408865&rft_dat=%3Cieee_6IE%3E4408865%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1424416310&rft.eisbn_list=9781424416318&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4408865&rfr_iscdi=true