DiffusionDet: Diffusion Model for Object Detection
We propose DiffusionDet, a new framework that formulates object detection as a denoising diffusion process from noisy boxes to object boxes. During the training stage, object boxes diffuse from ground-truth boxes to random distribution, and the model learns to reverse this noising process.
Saved in:
Main authors: | Chen, Shoufa; Sun, Peize; Song, Yibing; Luo, Ping |
Format: | Article |
Language: | eng |
Keywords: | Computer Science - Computer Vision and Pattern Recognition |
Online access: | Order full text |
container_end_page | |
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Chen, Shoufa; Sun, Peize; Song, Yibing; Luo, Ping |
description | We propose DiffusionDet, a new framework that formulates object detection as
a denoising diffusion process from noisy boxes to object boxes. During the
training stage, object boxes diffuse from ground-truth boxes to random
distribution, and the model learns to reverse this noising process. In
inference, the model refines a set of randomly generated boxes to the output
results in a progressive way. Our work possesses an appealing property of
flexibility, which enables the dynamic number of boxes and iterative
evaluation. The extensive experiments on the standard benchmarks show that
DiffusionDet achieves favorable performance compared to previous
well-established detectors. For example, DiffusionDet achieves 5.3 AP and 4.8
AP gains when evaluated with more boxes and iteration steps, under a zero-shot
transfer setting from COCO to CrowdHuman. Our code is available at
https://github.com/ShoufaChen/DiffusionDet. |
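The training process the abstract describes (diffusing ground-truth boxes toward a random distribution) can be sketched roughly as a standard forward-diffusion step applied to box coordinates. This is an illustrative assumption, not the paper's exact implementation: the cosine schedule, the (cx, cy, w, h) parameterization, and the signal scaling here are common diffusion-model conventions, and the function names are hypothetical.

```python
import math
import random

def cosine_alpha_bar(t: float) -> float:
    """Cumulative signal-retention coefficient for a cosine noise schedule.

    t in [0, 1]: t = 0 means (almost) clean boxes, t = 1 means pure noise.
    """
    return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

def noise_boxes(gt_boxes, t):
    """Forward diffusion: corrupt ground-truth boxes toward random noise.

    Boxes are (cx, cy, w, h) tuples with coordinates normalized to [0, 1].
    Returns the corrupted boxes at noise level t.
    """
    a = cosine_alpha_bar(t)
    noisy = []
    for box in gt_boxes:
        corrupted = []
        for v in box:
            # Scale coordinates to roughly [-1, 1] before mixing with noise,
            # then blend signal and Gaussian noise per the schedule.
            signal = 2.0 * v - 1.0
            eps = random.gauss(0.0, 1.0)
            corrupted.append(math.sqrt(a) * signal + math.sqrt(1.0 - a) * eps)
        noisy.append(tuple(corrupted))
    return noisy

# At t = 0 the boxes are nearly unchanged; at t = 1 they are close to pure
# Gaussian noise, which is what iterative inference would start from.
boxes = [(0.5, 0.5, 0.2, 0.3)]
print(noise_boxes(boxes, 0.0))
print(noise_boxes(boxes, 1.0))
```

The model would be trained to invert this corruption, and at inference it would repeatedly apply that learned denoising step to a set of randomly sampled boxes; because the boxes are just a set of coordinate tuples, their number can be chosen freely at inference time, which matches the flexibility the abstract highlights.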
doi_str_mv | 10.48550/arxiv.2211.09788 |
format | Article |
fullrecord | arXiv record cdi_arxiv_primary_2211_09788 (Open Access Repository). Title: DiffusionDet: Diffusion Model for Object Detection. Creators: Chen, Shoufa; Sun, Peize; Song, Yibing; Luo, Ping. Date: 2022-11-17. Subject: Computer Science - Computer Vision and Pattern Recognition. Rights: http://creativecommons.org/licenses/by/4.0 (free to read). DOI: 10.48550/arxiv.2211.09788. Links: https://arxiv.org/abs/2211.09788; https://doi.org/10.48550/arXiv.2211.09788. Abstract as given in the description field above. |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2211.09788 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2211_09788 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition |
title | DiffusionDet: Diffusion Model for Object Detection |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-23T06%3A10%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DiffusionDet:%20Diffusion%20Model%20for%20Object%20Detection&rft.au=Chen,%20Shoufa&rft.date=2022-11-17&rft_id=info:doi/10.48550/arxiv.2211.09788&rft_dat=%3Carxiv_GOX%3E2211_09788%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |