Boosting Deep Open World Recognition by Clustering

While convolutional neural networks have brought significant advances in robot vision, their ability is often limited to closed world scenarios, where the number of semantic concepts to be recognized is determined by the available training set. Since it is practically impossible to capture all possi...

Full description

Bibliographic details
Published in: IEEE robotics and automation letters 2020-10, Vol.5 (4), p.5985-5992
Main authors: Fontanel, Dario, Cermelli, Fabio, Mancini, Massimiliano, Bulo, Samuel Rota, Ricci, Elisa, Caputo, Barbara
Format: Article
Language: eng
container_end_page 5992
container_issue 4
container_start_page 5985
container_title IEEE robotics and automation letters
container_volume 5
creator Fontanel, Dario
Cermelli, Fabio
Mancini, Massimiliano
Bulo, Samuel Rota
Ricci, Elisa
Caputo, Barbara
description While convolutional neural networks have brought significant advances in robot vision, their ability is often limited to closed world scenarios, where the number of semantic concepts to be recognized is determined by the available training set. Since it is practically impossible to capture all possible semantic concepts present in the real world in a single training set, we need to break the closed world assumption, equipping our robot with the capability to act in an open world. To provide such ability, a robot vision system should be able to (i) identify whether an instance does not belong to the set of known categories (i.e., open set recognition), and (ii) extend its knowledge to learn new classes over time (i.e., incremental learning). In this work, we show how we can boost the performance of deep open world recognition algorithms by means of a new loss formulation enforcing a global to local clustering of class-specific features. In particular, a first loss term, i.e., global clustering, forces the network to map samples closer to the class centroid they belong to while the second one, local clustering, shapes the representation space in such a way that samples of the same class get closer in the representation space while pushing away neighbours belonging to other classes. Moreover, we propose a strategy to learn class-specific rejection thresholds, instead of heuristically estimating a single global threshold, as in previous works. Experiments on three benchmarks show the effectiveness of our approach.
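
The description above names two ideas that a small example may make more concrete: pulling samples toward their class centroid, and rejecting unknown samples with one threshold per class. The record does not give the paper's actual loss or threshold-learning procedure, so the following is only a minimal sketch under stated assumptions: the "global clustering" term is stood in for by a mean squared distance to the class centroid, the local clustering term is omitted, and the thresholds are hand-picked rather than learned. All function names and the toy data are hypothetical.

```python
import math

def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def class_centroids(features, labels):
    """Group features by label and return {label: centroid}."""
    groups = {}
    for f, y in zip(features, labels):
        groups.setdefault(y, []).append(f)
    return {y: centroid(pts) for y, pts in groups.items()}

def global_clustering_loss(features, labels, centroids):
    """Assumed stand-in for a global clustering term:
    mean squared distance of each sample to its own class centroid."""
    total = sum(sq_dist(f, centroids[y]) for f, y in zip(features, labels))
    return total / len(features)

def predict_open_set(feature, centroids, thresholds):
    """Return the nearest class, or None (unknown) when the sample is
    farther from that centroid than the class's own threshold."""
    label = min(centroids, key=lambda y: sq_dist(feature, centroids[y]))
    dist = math.sqrt(sq_dist(feature, centroids[label]))
    return label if dist <= thresholds[label] else None

# Toy 2-D features for two known classes (hypothetical data).
features = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
labels = ["cup", "cup", "book", "book"]

centroids = class_centroids(features, labels)
# Hand-picked per-class thresholds; the paper learns these instead.
thresholds = {"cup": 2.0, "book": 3.0}
loss = global_clustering_loss(features, labels, centroids)
```

In this toy setup, a sample at `[10.0, 13.5]` lies 2.5 away from the `book` centroid and is accepted because `book` has a threshold of 3.0, whereas a single global threshold of 2.0 would have rejected it. That is the kind of difference class-specific thresholds are meant to capture, though the paper's own procedure may differ in every detail.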
doi_str_mv 10.1109/LRA.2020.3010753
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2429903792</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9145605</ieee_id><sourcerecordid>2429903792</sourcerecordid><originalsourceid>FETCH-LOGICAL-c291t-69838d5a0d35d17a49d9c6f3b784395de07877fde145938b3a4afe7a9ba5c433</originalsourceid><addsrcrecordid>eNpNkD1rwzAQhkVpoSHNXuhi6Oz0pLMsa0zTTwgEQqCjkO1zcEgtV7KH_PsqJJROd8Pz3ns8jN1zmHMO-mm1WcwFCJgjcFASr9hEoFIpqjy__rffslkIewDgUijUcsLEs3NhaLtd8kLUJ-ueuuTL-UOdbKhyu64dWtcl5TFZHsYwkI_kHbtp7CHQ7DKnbPv2ul1-pKv1--dysUorofmQ5rrAopYWapQ1VzbTta7yBktVZLG6JlCFUk1NPJMaixJtZhtSVpdWVhnilD2ez_be_YwUBrN3o-9ioxGZ0BpQaREpOFOVdyF4akzv22_rj4aDObkx0Y05uTEXNzHycI60RPSH6_hGDhJ_AW9kXbg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2429903792</pqid></control><display><type>article</type><title>Boosting Deep Open World Recognition by Clustering</title><source>IEEE Electronic Library (IEL)</source><creator>Fontanel, Dario ; Cermelli, Fabio ; Mancini, Massimiliano ; Bulo, Samuel Rota ; Ricci, Elisa ; Caputo, Barbara</creator><creatorcontrib>Fontanel, Dario ; Cermelli, Fabio ; Mancini, Massimiliano ; Bulo, Samuel Rota ; Ricci, Elisa ; Caputo, Barbara</creatorcontrib><description>While convolutional neural networks have brought significant advances in robot vision, their ability is often limited to closed world scenarios, where the number of semantic concepts to be recognized is determined by the available training set. Since it is practically impossible to capture all possible semantic concepts present in the real world in a single training set, we need to break the closed world assumption, equipping our robot with the capability to act in an open world . 
To provide such ability, a robot vision system should be able to (i) identify whether an instance does not belong to the set of known categories (i.e., open set recognition), and (ii) extend its knowledge to learn new classes over time (i.e., incremental learning). In this work, we show how we can boost the performance of deep open world recognition algorithms by means of a new loss formulation enforcing a global to local clustering of class-specific features. In particular, a first loss term, i.e., global clustering , forces the network to map samples closer to the class centroid they belong to while the second one, local clustering , shapes the representation space in such a way that samples of the same class get closer in the representation space while pushing away neighbours belonging to other classes. Moreover, we propose a strategy to learn class-specific rejection thresholds, instead of heuristically estimating a single global threshold, as in previous works. Experiments on three benchmarks show the effectiveness of our approach.</description><identifier>ISSN: 2377-3766</identifier><identifier>EISSN: 2377-3766</identifier><identifier>DOI: 10.1109/LRA.2020.3010753</identifier><identifier>CODEN: IRALC6</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Algorithms ; Artificial neural networks ; Centroids ; Clustering ; Deep Learning for visual perception ; Feature extraction ; Machine vision ; Neural networks ; Recognition ; Representations ; Robot vision systems ; Robots ; Semantics ; Training ; Vision systems ; visual learning ; Visualization</subject><ispartof>IEEE robotics and automation letters, 2020-10, Vol.5 (4), p.5985-5992</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c291t-69838d5a0d35d17a49d9c6f3b784395de07877fde145938b3a4afe7a9ba5c433</citedby><cites>FETCH-LOGICAL-c291t-69838d5a0d35d17a49d9c6f3b784395de07877fde145938b3a4afe7a9ba5c433</cites><orcidid>0000-0002-2372-1367 ; 0000-0001-7169-0158 ; 0000-0002-8644-8281 ; 0000-0001-8595-9955 ; 0000-0001-7077-697X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9145605$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9145605$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Fontanel, Dario</creatorcontrib><creatorcontrib>Cermelli, Fabio</creatorcontrib><creatorcontrib>Mancini, Massimiliano</creatorcontrib><creatorcontrib>Bulo, Samuel Rota</creatorcontrib><creatorcontrib>Ricci, Elisa</creatorcontrib><creatorcontrib>Caputo, Barbara</creatorcontrib><title>Boosting Deep Open World Recognition by Clustering</title><title>IEEE robotics and automation letters</title><addtitle>LRA</addtitle><description>While convolutional neural networks have brought significant advances in robot vision, their ability is often limited to closed world scenarios, where the number of semantic concepts to be recognized is determined by the available training set. Since it is practically impossible to capture all possible semantic concepts present in the real world in a single training set, we need to break the closed world assumption, equipping our robot with the capability to act in an open world . 
To provide such ability, a robot vision system should be able to (i) identify whether an instance does not belong to the set of known categories (i.e., open set recognition), and (ii) extend its knowledge to learn new classes over time (i.e., incremental learning). In this work, we show how we can boost the performance of deep open world recognition algorithms by means of a new loss formulation enforcing a global to local clustering of class-specific features. In particular, a first loss term, i.e., global clustering , forces the network to map samples closer to the class centroid they belong to while the second one, local clustering , shapes the representation space in such a way that samples of the same class get closer in the representation space while pushing away neighbours belonging to other classes. Moreover, we propose a strategy to learn class-specific rejection thresholds, instead of heuristically estimating a single global threshold, as in previous works. Experiments on three benchmarks show the effectiveness of our approach.</description><subject>Algorithms</subject><subject>Artificial neural networks</subject><subject>Centroids</subject><subject>Clustering</subject><subject>Deep Learning for visual perception</subject><subject>Feature extraction</subject><subject>Machine vision</subject><subject>Neural networks</subject><subject>Recognition</subject><subject>Representations</subject><subject>Robot vision systems</subject><subject>Robots</subject><subject>Semantics</subject><subject>Training</subject><subject>Vision systems</subject><subject>visual 
learning</subject><subject>Visualization</subject><issn>2377-3766</issn><issn>2377-3766</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkD1rwzAQhkVpoSHNXuhi6Oz0pLMsa0zTTwgEQqCjkO1zcEgtV7KH_PsqJJROd8Pz3ns8jN1zmHMO-mm1WcwFCJgjcFASr9hEoFIpqjy__rffslkIewDgUijUcsLEs3NhaLtd8kLUJ-ueuuTL-UOdbKhyu64dWtcl5TFZHsYwkI_kHbtp7CHQ7DKnbPv2ul1-pKv1--dysUorofmQ5rrAopYWapQ1VzbTta7yBktVZLG6JlCFUk1NPJMaixJtZhtSVpdWVhnilD2ez_be_YwUBrN3o-9ioxGZ0BpQaREpOFOVdyF4akzv22_rj4aDObkx0Y05uTEXNzHycI60RPSH6_hGDhJ_AW9kXbg</recordid><startdate>20201001</startdate><enddate>20201001</enddate><creator>Fontanel, Dario</creator><creator>Cermelli, Fabio</creator><creator>Mancini, Massimiliano</creator><creator>Bulo, Samuel Rota</creator><creator>Ricci, Elisa</creator><creator>Caputo, Barbara</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-2372-1367</orcidid><orcidid>https://orcid.org/0000-0001-7169-0158</orcidid><orcidid>https://orcid.org/0000-0002-8644-8281</orcidid><orcidid>https://orcid.org/0000-0001-8595-9955</orcidid><orcidid>https://orcid.org/0000-0001-7077-697X</orcidid></search><sort><creationdate>20201001</creationdate><title>Boosting Deep Open World Recognition by Clustering</title><author>Fontanel, Dario ; Cermelli, Fabio ; Mancini, Massimiliano ; Bulo, Samuel Rota ; Ricci, Elisa ; Caputo, 
Barbara</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c291t-69838d5a0d35d17a49d9c6f3b784395de07877fde145938b3a4afe7a9ba5c433</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Artificial neural networks</topic><topic>Centroids</topic><topic>Clustering</topic><topic>Deep Learning for visual perception</topic><topic>Feature extraction</topic><topic>Machine vision</topic><topic>Neural networks</topic><topic>Recognition</topic><topic>Representations</topic><topic>Robot vision systems</topic><topic>Robots</topic><topic>Semantics</topic><topic>Training</topic><topic>Vision systems</topic><topic>visual learning</topic><topic>Visualization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fontanel, Dario</creatorcontrib><creatorcontrib>Cermelli, Fabio</creatorcontrib><creatorcontrib>Mancini, Massimiliano</creatorcontrib><creatorcontrib>Bulo, Samuel Rota</creatorcontrib><creatorcontrib>Ricci, Elisa</creatorcontrib><creatorcontrib>Caputo, Barbara</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE robotics and automation letters</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Fontanel, Dario</au><au>Cermelli, Fabio</au><au>Mancini, Massimiliano</au><au>Bulo, Samuel Rota</au><au>Ricci, Elisa</au><au>Caputo, Barbara</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Boosting Deep Open World Recognition by Clustering</atitle><jtitle>IEEE robotics and automation letters</jtitle><stitle>LRA</stitle><date>2020-10-01</date><risdate>2020</risdate><volume>5</volume><issue>4</issue><spage>5985</spage><epage>5992</epage><pages>5985-5992</pages><issn>2377-3766</issn><eissn>2377-3766</eissn><coden>IRALC6</coden><abstract>While convolutional neural networks have brought significant advances in robot vision, their ability is often limited to closed world scenarios, where the number of semantic concepts to be recognized is determined by the available training set. Since it is practically impossible to capture all possible semantic concepts present in the real world in a single training set, we need to break the closed world assumption, equipping our robot with the capability to act in an open world . To provide such ability, a robot vision system should be able to (i) identify whether an instance does not belong to the set of known categories (i.e., open set recognition), and (ii) extend its knowledge to learn new classes over time (i.e., incremental learning). In this work, we show how we can boost the performance of deep open world recognition algorithms by means of a new loss formulation enforcing a global to local clustering of class-specific features. In particular, a first loss term, i.e., global clustering , forces the network to map samples closer to the class centroid they belong to while the second one, local clustering , shapes the representation space in such a way that samples of the same class get closer in the representation space while pushing away neighbours belonging to other classes. 
Moreover, we propose a strategy to learn class-specific rejection thresholds, instead of heuristically estimating a single global threshold, as in previous works. Experiments on three benchmarks show the effectiveness of our approach.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/LRA.2020.3010753</doi><tpages>8</tpages><orcidid>https://orcid.org/0000-0002-2372-1367</orcidid><orcidid>https://orcid.org/0000-0001-7169-0158</orcidid><orcidid>https://orcid.org/0000-0002-8644-8281</orcidid><orcidid>https://orcid.org/0000-0001-8595-9955</orcidid><orcidid>https://orcid.org/0000-0001-7077-697X</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2377-3766
ispartof IEEE robotics and automation letters, 2020-10, Vol.5 (4), p.5985-5992
issn 2377-3766
2377-3766
language eng
recordid cdi_proquest_journals_2429903792
source IEEE Electronic Library (IEL)
subjects Algorithms
Artificial neural networks
Centroids
Clustering
Deep Learning for visual perception
Feature extraction
Machine vision
Neural networks
Recognition
Representations
Robot vision systems
Robots
Semantics
Training
Vision systems
visual learning
Visualization
title Boosting Deep Open World Recognition by Clustering
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-15T17%3A10%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Boosting%20Deep%20Open%20World%20Recognition%20by%20Clustering&rft.jtitle=IEEE%20robotics%20and%20automation%20letters&rft.au=Fontanel,%20Dario&rft.date=2020-10-01&rft.volume=5&rft.issue=4&rft.spage=5985&rft.epage=5992&rft.pages=5985-5992&rft.issn=2377-3766&rft.eissn=2377-3766&rft.coden=IRALC6&rft_id=info:doi/10.1109/LRA.2020.3010753&rft_dat=%3Cproquest_RIE%3E2429903792%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2429903792&rft_id=info:pmid/&rft_ieee_id=9145605&rfr_iscdi=true