Large-Scale Object Detection in the Wild With Imbalanced Data Distribution, and Multi-Labels
Training with more data has always been the most stable and effective way of improving performance in the deep learning era. The Open Images dataset, the largest object detection dataset, presents significant opportunities and challenges for general and sophisticated scenarios. However, its semi-aut...
Published in: | IEEE transactions on pattern analysis and machine intelligence 2024-12, Vol.46 (12), p.9255-9271 |
---|---|
Main authors: | Pan, Cong; Peng, Junran; Bu, Xingyuan; Zhang, Zhaoxiang |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 9271 |
---|---|
container_issue | 12 |
container_start_page | 9255 |
container_title | IEEE transactions on pattern analysis and machine intelligence |
container_volume | 46 |
creator | Pan, Cong; Peng, Junran; Bu, Xingyuan; Zhang, Zhaoxiang |
description | Training with more data has always been the most stable and effective way of improving performance in the deep learning era. The Open Images dataset, the largest object detection dataset, presents significant opportunities and challenges for general and sophisticated scenarios. However, its semi-automatic collection and labeling process, designed to manage the huge data scale, leads to label-related problems, including explicit or implicit multiple labels per object and highly imbalanced label distribution. In this work, we quantitatively analyze the major problems in large-scale object detection and provide a detailed yet comprehensive demonstration of our solutions. First, we design a concurrent softmax to handle the multi-label problems in object detection and propose a soft-balance sampling method with a hybrid training scheduler to address the label imbalance. This approach yields a notable improvement of 3.34 points, achieving the best single-model performance with a mAP of 60.90% on the public object detection test set of Open Images. Then, we introduce a well-designed ensemble mechanism that substantially enhances the performance of the single model, achieving an overall mAP of 67.17%, which is 4.29 points higher than the best result from the Open Images public test 2018. |
doi_str_mv | 10.1109/TPAMI.2024.3421300 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_miscellaneous_3074718114</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10579784</ieee_id><sourcerecordid>3074718114</sourcerecordid><originalsourceid>FETCH-LOGICAL-c205t-9671218b65743d878ce67d994ab9c4a57e3801634a406fbab00907a89369adb43</originalsourceid><addsrcrecordid>eNpNkMtOwzAQRS0EoqXwAwghL1mQYsdObC-rlkelVEWiiA1SZCdT6iqPEjsL_p6EFsRmZnPu1cxB6JKSMaVE3a2eJ4v5OCQhHzMeUkbIERpSxVTAIqaO0ZDQOAykDOUAnTm3JYTyiLBTNGBScaW4GKL3RDcfELxkugC8NFvIPJ6B75atK2wr7DeA32yRd8Nv8Lw0utBVBjmeaa_xzDrfWNP29C3WVY4XbeFtkGgDhTtHJ2tdOLg47BF6fbhfTZ-CZPk4n06SIAtJ5AMVCxpSaeJIcJZLITOIRd7dp43KuI4EMNm9wrjmJF4bbQhRRGipWKx0bjgboZt9766pP1twPi2ty6DoLoW6dSkjggsqKe3RcI9mTe1cA-t019hSN18pJWlvNf2xmvZW04PVLnR96G9NCflf5FdjB1ztAQsA_xojoYTk7BuPK3n-</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3074718114</pqid></control><display><type>article</type><title>Large-Scale Object Detection in the Wild With Imbalanced Data Distribution, and Multi-Labels</title><source>IEEE Electronic Library (IEL)</source><creator>Pan, Cong ; Peng, Junran ; Bu, Xingyuan ; Zhang, Zhaoxiang</creator><creatorcontrib>Pan, Cong ; Peng, Junran ; Bu, Xingyuan ; Zhang, Zhaoxiang</creatorcontrib><description>Training with more data has always been the most stable and effective way of improving performance in the deep learning era. The Open Images dataset, the largest object detection dataset, presents significant opportunities and challenges for general and sophisticated scenarios. However, its semi-automatic collection and labeling process, designed to manage the huge data scale, leads to label-related problems, including explicit or implicit multiple labels per object and highly imbalanced label distribution. 
In this work, we quantitatively analyze the major problems in large-scale object detection and provide a detailed yet comprehensive demonstration of our solutions. First, we design a concurrent softmax to handle the multi-label problems in object detection and propose a soft-balance sampling method with a hybrid training scheduler to address the label imbalance. This approach yields a notable improvement of 3.34 points, achieving the best single-model performance with a mAP of 60.90% on the public object detection test set of Open Images. Then, we introduce a well-designed ensemble mechanism that substantially enhances the performance of the single model, achieving an overall mAP of 67.17%, which is 4.29 points higher than the best result from the Open Images public test 2018.</description><identifier>ISSN: 0162-8828</identifier><identifier>ISSN: 1939-3539</identifier><identifier>EISSN: 1939-3539</identifier><identifier>EISSN: 2160-9292</identifier><identifier>DOI: 10.1109/TPAMI.2024.3421300</identifier><identifier>PMID: 38949947</identifier><identifier>CODEN: ITPIDJ</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Annotations ; Automobiles ; Computer vision ; Deep learning ; Detectors ; long-tail distribution ; multi-labels ; noisy labels ; Object detection ; Toy manufacturing industry ; Training</subject><ispartof>IEEE transactions on pattern analysis and machine intelligence, 2024-12, Vol.46 (12), p.9255-9271</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c205t-9671218b65743d878ce67d994ab9c4a57e3801634a406fbab00907a89369adb43</cites><orcidid>0000-0001-5959-4294 ; 0000-0001-5276-0114 ; 
0000-0003-2648-3875</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10579784$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,777,781,793,27905,27906,54739</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10579784$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38949947$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Pan, Cong</creatorcontrib><creatorcontrib>Peng, Junran</creatorcontrib><creatorcontrib>Bu, Xingyuan</creatorcontrib><creatorcontrib>Zhang, Zhaoxiang</creatorcontrib><title>Large-Scale Object Detection in the Wild With Imbalanced Data Distribution, and Multi-Labels</title><title>IEEE transactions on pattern analysis and machine intelligence</title><addtitle>TPAMI</addtitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><description>Training with more data has always been the most stable and effective way of improving performance in the deep learning era. The Open Images dataset, the largest object detection dataset, presents significant opportunities and challenges for general and sophisticated scenarios. However, its semi-automatic collection and labeling process, designed to manage the huge data scale, leads to label-related problems, including explicit or implicit multiple labels per object and highly imbalanced label distribution. In this work, we quantitatively analyze the major problems in large-scale object detection and provide a detailed yet comprehensive demonstration of our solutions. First, we design a concurrent softmax to handle the multi-label problems in object detection and propose a soft-balance sampling method with a hybrid training scheduler to address the label imbalance. 
This approach yields a notable improvement of 3.34 points, achieving the best single-model performance with a mAP of 60.90% on the public object detection test set of Open Images. Then, we introduce a well-designed ensemble mechanism that substantially enhances the performance of the single model, achieving an overall mAP of 67.17%, which is 4.29 points higher than the best result from the Open Images public test 2018.</description><subject>Annotations</subject><subject>Automobiles</subject><subject>Computer vision</subject><subject>Deep learning</subject><subject>Detectors</subject><subject>long-tail distribution</subject><subject>multi-labels</subject><subject>noisy labels</subject><subject>Object detection</subject><subject>Toy manufacturing industry</subject><subject>Training</subject><issn>0162-8828</issn><issn>1939-3539</issn><issn>1939-3539</issn><issn>2160-9292</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkMtOwzAQRS0EoqXwAwghL1mQYsdObC-rlkelVEWiiA1SZCdT6iqPEjsL_p6EFsRmZnPu1cxB6JKSMaVE3a2eJ4v5OCQhHzMeUkbIERpSxVTAIqaO0ZDQOAykDOUAnTm3JYTyiLBTNGBScaW4GKL3RDcfELxkugC8NFvIPJ6B75atK2wr7DeA32yRd8Nv8Lw0utBVBjmeaa_xzDrfWNP29C3WVY4XbeFtkGgDhTtHJ2tdOLg47BF6fbhfTZ-CZPk4n06SIAtJ5AMVCxpSaeJIcJZLITOIRd7dp43KuI4EMNm9wrjmJF4bbQhRRGipWKx0bjgboZt9766pP1twPi2ty6DoLoW6dSkjggsqKe3RcI9mTe1cA-t019hSN18pJWlvNf2xmvZW04PVLnR96G9NCflf5FdjB1ztAQsA_xojoYTk7BuPK3n-</recordid><startdate>202412</startdate><enddate>202412</enddate><creator>Pan, Cong</creator><creator>Peng, Junran</creator><creator>Bu, Xingyuan</creator><creator>Zhang, 
Zhaoxiang</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0001-5959-4294</orcidid><orcidid>https://orcid.org/0000-0001-5276-0114</orcidid><orcidid>https://orcid.org/0000-0003-2648-3875</orcidid></search><sort><creationdate>202412</creationdate><title>Large-Scale Object Detection in the Wild With Imbalanced Data Distribution, and Multi-Labels</title><author>Pan, Cong ; Peng, Junran ; Bu, Xingyuan ; Zhang, Zhaoxiang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c205t-9671218b65743d878ce67d994ab9c4a57e3801634a406fbab00907a89369adb43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Annotations</topic><topic>Automobiles</topic><topic>Computer vision</topic><topic>Deep learning</topic><topic>Detectors</topic><topic>long-tail distribution</topic><topic>multi-labels</topic><topic>noisy labels</topic><topic>Object detection</topic><topic>Toy manufacturing industry</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Pan, Cong</creatorcontrib><creatorcontrib>Peng, Junran</creatorcontrib><creatorcontrib>Bu, Xingyuan</creatorcontrib><creatorcontrib>Zhang, Zhaoxiang</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Pan, Cong</au><au>Peng, Junran</au><au>Bu, 
Xingyuan</au><au>Zhang, Zhaoxiang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Large-Scale Object Detection in the Wild With Imbalanced Data Distribution, and Multi-Labels</atitle><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle><stitle>TPAMI</stitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><date>2024-12</date><risdate>2024</risdate><volume>46</volume><issue>12</issue><spage>9255</spage><epage>9271</epage><pages>9255-9271</pages><issn>0162-8828</issn><issn>1939-3539</issn><eissn>1939-3539</eissn><eissn>2160-9292</eissn><coden>ITPIDJ</coden><abstract>Training with more data has always been the most stable and effective way of improving performance in the deep learning era. The Open Images dataset, the largest object detection dataset, presents significant opportunities and challenges for general and sophisticated scenarios. However, its semi-automatic collection and labeling process, designed to manage the huge data scale, leads to label-related problems, including explicit or implicit multiple labels per object and highly imbalanced label distribution. In this work, we quantitatively analyze the major problems in large-scale object detection and provide a detailed yet comprehensive demonstration of our solutions. First, we design a concurrent softmax to handle the multi-label problems in object detection and propose a soft-balance sampling method with a hybrid training scheduler to address the label imbalance. This approach yields a notable improvement of 3.34 points, achieving the best single-model performance with a mAP of 60.90% on the public object detection test set of Open Images. 
Then, we introduce a well-designed ensemble mechanism that substantially enhances the performance of the single model, achieving an overall mAP of 67.17%, which is 4.29 points higher than the best result from the Open Images public test 2018.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>38949947</pmid><doi>10.1109/TPAMI.2024.3421300</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0001-5959-4294</orcidid><orcidid>https://orcid.org/0000-0001-5276-0114</orcidid><orcidid>https://orcid.org/0000-0003-2648-3875</orcidid></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0162-8828 |
ispartof | IEEE transactions on pattern analysis and machine intelligence, 2024-12, Vol.46 (12), p.9255-9271 |
issn | 0162-8828; 1939-3539; 2160-9292 |
language | eng |
recordid | cdi_proquest_miscellaneous_3074718114 |
source | IEEE Electronic Library (IEL) |
subjects | Annotations; Automobiles; Computer vision; Deep learning; Detectors; long-tail distribution; multi-labels; noisy labels; Object detection; Toy manufacturing industry; Training |
title | Large-Scale Object Detection in the Wild With Imbalanced Data Distribution, and Multi-Labels |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T02%3A20%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Large-Scale%20Object%20Detection%20in%20the%20Wild%20With%20Imbalanced%20Data%20Distribution,%20and%20Multi-Labels&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Pan,%20Cong&rft.date=2024-12&rft.volume=46&rft.issue=12&rft.spage=9255&rft.epage=9271&rft.pages=9255-9271&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2024.3421300&rft_dat=%3Cproquest_RIE%3E3074718114%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3074718114&rft_id=info:pmid/38949947&rft_ieee_id=10579784&rfr_iscdi=true |