Learning hierarchical representation with sparsity for RGB-D object recognition

RGB-D sensors have gained popularity in object recognition research because of their low cost and their ability to provide synchronized RGB and depth images. Consequently, researchers have proposed new methods to extract features from RGB-D data. On the other hand, learning-based feature representation...

Detailed description

Bibliographic details
Main authors:	Kuan-Ting Yu, Shih-Huan Tseng, Li-Chen Fu
Format:	Conference proceeding
Language:	eng
Subjects:	Accuracy; Dictionaries; Encoding; Feature extraction; Filter banks; Histograms; Shape

container_end_page	3016
container_issue
container_start_page	3011
container_title
container_volume
creator	Kuan-Ting Yu; Shih-Huan Tseng; Li-Chen Fu
description	RGB-D sensors have gained popularity in object recognition research because of their low cost and their ability to provide synchronized RGB and depth images. Consequently, researchers have proposed new methods to extract features from RGB-D data. Meanwhile, learning-based feature representation is a promising approach for 2D image classification. By exploiting sparsity in 2D image signals, we can learn image representations instead of using hand-crafted local descriptors such as SIFT or HoG. This framework inspired us to learn features from RGB-D data. Our work focuses on two goals. First, we propose a novel Hierarchical Sparse Shape Descriptor (HSSD) to form a learning-based representation for 3D shapes. To achieve this, we analyze several 3D feature extraction techniques and propose a unified view of them. Then, we learn a hierarchical shape representation with sparse coding, max pooling, and local grouping. Second, we investigate whether RGB and depth information should be fused at a lower level or a higher level. Experimental results show that, first, our HSSD algorithm can learn a shape dictionary and provide shape cues in addition to the 2D cues. The proposed HSSD algorithm achieves 84% accuracy on a household RGB-D object dataset and outperforms the widely used VFH shape feature by 13%. Second, fusing RGB-D information at a lower level does not improve recognition performance.
doi_str_mv	10.1109/IROS.2012.6386175
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6386175</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6386175</ieee_id><sourcerecordid>6386175</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-ae8a9d0e02bd9c485070977eddee8d4b5e88dc282f16fce5c57bc1239fc13a403</originalsourceid><addsrcrecordid>eNo9kF1LwzAYheMXOOd-gHiTP9CaN2m-LnXTORgMpl6PNH27Zsy2pAHZv3fi9OpcPIeHwyHkDlgOwOzDYr16yzkDnithFGh5RiZWGyiUFqCFhHMy4iBFxoxSF-TmD2hx-Q-kuSaTYdgxxo5OJcCOyGqJLrah3dImYHTRN8G7PY3YRxywTS6FrqVfITV06F0cQjrQuot0PX_KZrQrd-jTse27bRt-qrfkqnb7ASenHJOPl-f36Wu2XM0X08dlFo7bU-bQOFsxZLysrC-MZJpZrbGqEE1VlBKNqTw3vAZVe5Re6tIDF7b2IFzBxJjc_3oDIm76GD5dPGxO34hvHwlU0g</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Learning hierarchical representation with sparsity for RGB-D object recognition</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Kuan-Ting Yu ; Shih-Huan Tseng ; Li-Chen Fu</creator><creatorcontrib>Kuan-Ting Yu ; Shih-Huan Tseng ; Li-Chen Fu</creatorcontrib><description>RGB-D sensor has gained its popularity in the study of object recognition for its low cost as well as its capability to provide synchronized RGB and depth images. Thus, researchers have proposed new methods to extract features from RGB-D data. On the other hand, learning-based feature representation is a promising approach for 2D image classification. By exploiting sparsity in 2D image signals, we can learn image representation instead of using hand-crafted local descriptors like SIFT or HoG. This framework inspired us to learn features from RGB-D data. Our work focuses on two goals. First, we propose a novel Hierarchical Sparse Shape Descriptor (HSSD) to form learning-based representation for 3D shapes. 
To achieve this, we analyze several 3D feature extraction techniques and propose a unified view of them. Then, we learn hierarchical shape representation with sparse coding, max pooling and local grouping. Second, we investigate whether RGB and depth information should be fused at lower level or higher level. Experimental results show that, first, our HSSD algorithm can learn shape dictionary and provide shape cues in addition to the 2D cues. Using the proposed HSSD algorithm achieves 84% accuracy on a household RGB-D object dataset and outperforms a widely used VFH shape feature by 13%. Second, fusing RGB-D information at lower level does not improve recognition performance.</description><identifier>ISSN: 2153-0858</identifier><identifier>ISBN: 1467317373</identifier><identifier>ISBN: 9781467317375</identifier><identifier>EISSN: 2153-0866</identifier><identifier>EISBN: 9781467317351</identifier><identifier>EISBN: 1467317365</identifier><identifier>EISBN: 9781467317368</identifier><identifier>EISBN: 1467317357</identifier><identifier>DOI: 10.1109/IROS.2012.6386175</identifier><language>eng</language><publisher>IEEE</publisher><subject>Accuracy ; Dictionaries ; Encoding ; Feature extraction ; Filter banks ; Histograms ; Shape</subject><ispartof>2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012, p.3011-3016</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6386175$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,778,782,787,788,2054,27908,54903</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6386175$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Kuan-Ting Yu</creatorcontrib><creatorcontrib>Shih-Huan 
Tseng</creatorcontrib><creatorcontrib>Li-Chen Fu</creatorcontrib><title>Learning hierarchical representation with sparsity for RGB-D object recognition</title><title>2012 IEEE/RSJ International Conference on Intelligent Robots and Systems</title><addtitle>IROS</addtitle><description>RGB-D sensor has gained its popularity in the study of object recognition for its low cost as well as its capability to provide synchronized RGB and depth images. Thus, researchers have proposed new methods to extract features from RGB-D data. On the other hand, learning-based feature representation is a promising approach for 2D image classification. By exploiting sparsity in 2D image signals, we can learn image representation instead of using hand-crafted local descriptors like SIFT or HoG. This framework inspired us to learn features from RGB-D data. Our work focuses on two goals. First, we propose a novel Hierarchical Sparse Shape Descriptor (HSSD) to form learning-based representation for 3D shapes. To achieve this, we analyze several 3D feature extraction techniques and propose a unified view of them. Then, we learn hierarchical shape representation with sparse coding, max pooling and local grouping. Second, we investigate whether RGB and depth information should be fused at lower level or higher level. Experimental results show that, first, our HSSD algorithm can learn shape dictionary and provide shape cues in addition to the 2D cues. Using the proposed HSSD algorithm achieves 84% accuracy on a household RGB-D object dataset and outperforms a widely used VFH shape feature by 13%. 
Second, fusing RGB-D information at lower level does not improve recognition performance.</description><subject>Accuracy</subject><subject>Dictionaries</subject><subject>Encoding</subject><subject>Feature extraction</subject><subject>Filter banks</subject><subject>Histograms</subject><subject>Shape</subject><issn>2153-0858</issn><issn>2153-0866</issn><isbn>1467317373</isbn><isbn>9781467317375</isbn><isbn>9781467317351</isbn><isbn>1467317365</isbn><isbn>9781467317368</isbn><isbn>1467317357</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2012</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo9kF1LwzAYheMXOOd-gHiTP9CaN2m-LnXTORgMpl6PNH27Zsy2pAHZv3fi9OpcPIeHwyHkDlgOwOzDYr16yzkDnithFGh5RiZWGyiUFqCFhHMy4iBFxoxSF-TmD2hx-Q-kuSaTYdgxxo5OJcCOyGqJLrah3dImYHTRN8G7PY3YRxywTS6FrqVfITV06F0cQjrQuot0PX_KZrQrd-jTse27bRt-qrfkqnb7ASenHJOPl-f36Wu2XM0X08dlFo7bU-bQOFsxZLysrC-MZJpZrbGqEE1VlBKNqTw3vAZVe5Re6tIDF7b2IFzBxJjc_3oDIm76GD5dPGxO34hvHwlU0g</recordid><startdate>201210</startdate><enddate>201210</enddate><creator>Kuan-Ting Yu</creator><creator>Shih-Huan Tseng</creator><creator>Li-Chen Fu</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>201210</creationdate><title>Learning hierarchical representation with sparsity for RGB-D object recognition</title><author>Kuan-Ting Yu ; Shih-Huan Tseng ; Li-Chen Fu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-ae8a9d0e02bd9c485070977eddee8d4b5e88dc282f16fce5c57bc1239fc13a403</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Accuracy</topic><topic>Dictionaries</topic><topic>Encoding</topic><topic>Feature extraction</topic><topic>Filter 
banks</topic><topic>Histograms</topic><topic>Shape</topic><toplevel>online_resources</toplevel><creatorcontrib>Kuan-Ting Yu</creatorcontrib><creatorcontrib>Shih-Huan Tseng</creatorcontrib><creatorcontrib>Li-Chen Fu</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Kuan-Ting Yu</au><au>Shih-Huan Tseng</au><au>Li-Chen Fu</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Learning hierarchical representation with sparsity for RGB-D object recognition</atitle><btitle>2012 IEEE/RSJ International Conference on Intelligent Robots and Systems</btitle><stitle>IROS</stitle><date>2012-10</date><risdate>2012</risdate><spage>3011</spage><epage>3016</epage><pages>3011-3016</pages><issn>2153-0858</issn><eissn>2153-0866</eissn><isbn>1467317373</isbn><isbn>9781467317375</isbn><eisbn>9781467317351</eisbn><eisbn>1467317365</eisbn><eisbn>9781467317368</eisbn><eisbn>1467317357</eisbn><abstract>RGB-D sensor has gained its popularity in the study of object recognition for its low cost as well as its capability to provide synchronized RGB and depth images. Thus, researchers have proposed new methods to extract features from RGB-D data. On the other hand, learning-based feature representation is a promising approach for 2D image classification. By exploiting sparsity in 2D image signals, we can learn image representation instead of using hand-crafted local descriptors like SIFT or HoG. This framework inspired us to learn features from RGB-D data. Our work focuses on two goals. 
First, we propose a novel Hierarchical Sparse Shape Descriptor (HSSD) to form learning-based representation for 3D shapes. To achieve this, we analyze several 3D feature extraction techniques and propose a unified view of them. Then, we learn hierarchical shape representation with sparse coding, max pooling and local grouping. Second, we investigate whether RGB and depth information should be fused at lower level or higher level. Experimental results show that, first, our HSSD algorithm can learn shape dictionary and provide shape cues in addition to the 2D cues. Using the proposed HSSD algorithm achieves 84% accuracy on a household RGB-D object dataset and outperforms a widely used VFH shape feature by 13%. Second, fusing RGB-D information at lower level does not improve recognition performance.</abstract><pub>IEEE</pub><doi>10.1109/IROS.2012.6386175</doi><tpages>6</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 2153-0858
ispartof	2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012, p.3011-3016
issn	2153-0858; 2153-0866
language	eng
recordid	cdi_ieee_primary_6386175
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Accuracy; Dictionaries; Encoding; Feature extraction; Filter banks; Histograms; Shape
title	Learning hierarchical representation with sparsity for RGB-D object recognition
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T22%3A43%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Learning%20hierarchical%20representation%20with%20sparsity%20for%20RGB-D%20object%20recognition&rft.btitle=2012%20IEEE/RSJ%20International%20Conference%20on%20Intelligent%20Robots%20and%20Systems&rft.au=Kuan-Ting%20Yu&rft.date=2012-10&rft.spage=3011&rft.epage=3016&rft.pages=3011-3016&rft.issn=2153-0858&rft.eissn=2153-0866&rft.isbn=1467317373&rft.isbn_list=9781467317375&rft_id=info:doi/10.1109/IROS.2012.6386175&rft_dat=%3Cieee_6IE%3E6386175%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781467317351&rft.eisbn_list=1467317365&rft.eisbn_list=9781467317368&rft.eisbn_list=1467317357&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6386175&rfr_iscdi=true