Indoor Scene Understanding with Geometric and Semantic Contexts

Bibliographic Details
Published in: International journal of computer vision, 2015-04, Vol. 112 (2), p. 204-220
Authors: Choi, Wongun; Chao, Yu-Wei; Pantofaru, Caroline; Savarese, Silvio
Format: Article
Language: English
Online access: Full text

Description: Truly understanding a scene involves integrating information at multiple levels as well as studying the interactions between scene elements. Individual object detectors, layout estimators and scene classifiers are powerful but ultimately confounded by complicated real-world scenes with high variability, different viewpoints and occlusions. We propose a method that can automatically learn the interactions among scene elements and apply them to the holistic understanding of indoor scenes from a single image. This interpretation is performed within a hierarchical interaction model which describes an image by a parse graph, thereby fusing together object detection, layout estimation and scene classification. At the root of the parse graph is the scene type and layout while the leaves are the individual detections of objects. In between is the core of the system, our 3D Geometric Phrases (3DGP). We conduct extensive experimental evaluations on single image 3D scene understanding using both 2D and 3D metrics. The results demonstrate that our model with 3DGPs can provide robust estimation of scene type, 3D space, and 3D objects by leveraging the contextual relationships among the visual elements.
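The parse-graph structure the abstract describes (scene type and layout at the root, 3D Geometric Phrases as intermediate nodes, object detections as leaves) can be sketched as a simple data structure. The following is a minimal illustrative sketch only, not the authors' implementation; all class names, fields, and the scoring function are hypothetical stand-ins.

```python
# Minimal illustrative sketch of the parse graph described in the abstract.
# All names, fields, and the scoring rule are hypothetical, not the authors' code.
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObjectDetection:
    """Leaf node: one detected object with a detector score and a 3D hypothesis."""
    category: str            # e.g. "sofa", "table"
    score: float             # detector confidence
    cuboid: List[float]      # hypothetical 3D bounding-cuboid parameters


@dataclass
class GeometricPhrase3D:
    """Intermediate node: a 3D Geometric Phrase grouping co-occurring objects
    in a consistent spatial configuration (e.g. a sofa facing a coffee table)."""
    members: List[ObjectDetection]
    spatial_score: float     # how well the members fit the phrase's learned layout


@dataclass
class ParseGraph:
    """Root node: a scene-level hypothesis tying together scene type, room layout,
    phrases, and any detections not explained by a phrase."""
    scene_type: str                                   # e.g. "living room"
    layout: List[float]                               # hypothetical room-layout parameters
    phrases: List[GeometricPhrase3D] = field(default_factory=list)
    lone_objects: List[ObjectDetection] = field(default_factory=list)

    def total_score(self) -> float:
        # Stand-in for the model's scoring: sum detector and phrase scores.
        # The actual model combines jointly learned terms over the whole graph.
        s = sum(p.spatial_score + sum(o.score for o in p.members)
                for p in self.phrases)
        s += sum(o.score for o in self.lone_objects)
        return s
```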

DOI: 10.1007/s11263-014-0779-4
ISSN: 0920-5691
EISSN: 1573-1405
Source: SpringerLink Journals - AutoHoldings

Subjects:
3-D technology
Analysis
Artificial Intelligence
Classification
Computer Imaging
Computer Science
Detectors
Dining rooms
Graphs
Hypotheses
Image detection
Image Processing and Computer Vision
Image processing systems
Indoor
Mathematical models
Pattern Recognition
Pattern Recognition and Graphics
Semantics
Studies
Three dimensional
Vision
Visual