Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detection
We introduce a detection framework for dense crowd counting and eliminate the need for the prevalent density regression paradigm. Typical counting models predict crowd density for an image as opposed to detecting every person. These regression methods, in general, fail to localize persons accurate e...
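Main authors: | Sam, Deepak Babu; Peri, Skand Vishwanath; Sundararaman, Mukuntha Narayanan; Kamath, Amogh; Babu, R. Venkatesh |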
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Computer Vision and Pattern Recognition |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Sam, Deepak Babu ; Peri, Skand Vishwanath ; Sundararaman, Mukuntha Narayanan ; Kamath, Amogh ; Babu, R. Venkatesh |
description | We introduce a detection framework for dense crowd counting and eliminate the
need for the prevalent density regression paradigm. Typical counting models
predict crowd density for an image as opposed to detecting every person. These
regression methods, in general, fail to localize persons accurately enough for
most applications other than counting. Hence, we adopt an architecture that
locates every person in the crowd, sizes the spotted heads with bounding boxes
and then counts them. Compared to normal object or face detectors, there exist
certain unique challenges in designing such a detection system. Some of them
are direct consequences of the huge diversity in dense crowds along with the
need to predict boxes contiguously. We solve these issues and develop our
LSC-CNN model, which can reliably detect heads of people across sparse to dense
crowds. LSC-CNN employs a multi-column architecture with top-down feedback
processing to better resolve persons and produce refined predictions at
multiple resolutions. Interestingly, the proposed training regime requires only
point head annotation, but can estimate approximate size information of heads.
We show that LSC-CNN not only localizes better than existing density
regressors, but also outperforms them in counting. The code for our approach is
available at https://github.com/val-iisc/lsc-cnn. |
doi_str_mv | 10.48550/arxiv.1906.07538 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_1906_07538</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1906_07538</sourcerecordid><originalsourceid>FETCH-LOGICAL-a678-60cdbb60751164ae657437c84c48cdc94caa929f64f442e39cb71008ed7395523</originalsourceid><addsrcrecordid>eNotj8tOwzAURL1hgQofwAp_AAl2_GZXhVelSLy6j5ybG2Qp2FWSBsrXEwqrkWakozmEXHCWS6sUu_bDV5hz7pjOmVHCnpKXKoGf8Iq-hW-kPra0TPs43dA1wH5Ylv5AX3FM_RziO33GtOuRhkhvMY5IyyF9tiOdg1-KCWEKKZ6Rk873I57_54ps7--25WNWPT1synWVeW1sphm0TaOXF5xr6VErI4UBK0FaaMFJ8N4VrtOyk7JA4aAxnDGLrRFOqUKsyOUf9uhU74bw4YdD_etWH93ED9_0SKA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detection</title><source>arXiv.org</source><creator>Sam, Deepak Babu ; Peri, Skand Vishwanath ; Sundararaman, Mukuntha Narayanan ; Kamath, Amogh ; Babu, R. Venkatesh</creator><creatorcontrib>Sam, Deepak Babu ; Peri, Skand Vishwanath ; Sundararaman, Mukuntha Narayanan ; Kamath, Amogh ; Babu, R. Venkatesh</creatorcontrib><description>We introduce a detection framework for dense crowd counting and eliminate the
need for the prevalent density regression paradigm. Typical counting models
predict crowd density for an image as opposed to detecting every person. These
regression methods, in general, fail to localize persons accurately enough for
most applications other than counting. Hence, we adopt an architecture that
locates every person in the crowd, sizes the spotted heads with bounding boxes
and then counts them. Compared to normal object or face detectors, there exist
certain unique challenges in designing such a detection system. Some of them
are direct consequences of the huge diversity in dense crowds along with the
need to predict boxes contiguously. We solve these issues and develop our
LSC-CNN model, which can reliably detect heads of people across sparse to dense
crowds. LSC-CNN employs a multi-column architecture with top-down feedback
processing to better resolve persons and produce refined predictions at
multiple resolutions. Interestingly, the proposed training regime requires only
point head annotation, but can estimate approximate size information of heads.
We show that LSC-CNN not only localizes better than existing density
regressors, but also outperforms them in counting. The code for our approach is
available at https://github.com/val-iisc/lsc-cnn.</description><identifier>DOI: 10.48550/arxiv.1906.07538</identifier><language>eng</language><subject>Computer Science - Computer Vision and Pattern Recognition</subject><creationdate>2019-06</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/1906.07538$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.1906.07538$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Sam, Deepak Babu</creatorcontrib><creatorcontrib>Peri, Skand Vishwanath</creatorcontrib><creatorcontrib>Sundararaman, Mukuntha Narayanan</creatorcontrib><creatorcontrib>Kamath, Amogh</creatorcontrib><creatorcontrib>Babu, R. Venkatesh</creatorcontrib><title>Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detection</title><description>We introduce a detection framework for dense crowd counting and eliminate the
need for the prevalent density regression paradigm. Typical counting models
predict crowd density for an image as opposed to detecting every person. These
regression methods, in general, fail to localize persons accurately enough for
most applications other than counting. Hence, we adopt an architecture that
locates every person in the crowd, sizes the spotted heads with bounding boxes
and then counts them. Compared to normal object or face detectors, there exist
certain unique challenges in designing such a detection system. Some of them
are direct consequences of the huge diversity in dense crowds along with the
need to predict boxes contiguously. We solve these issues and develop our
LSC-CNN model, which can reliably detect heads of people across sparse to dense
crowds. LSC-CNN employs a multi-column architecture with top-down feedback
processing to better resolve persons and produce refined predictions at
multiple resolutions. Interestingly, the proposed training regime requires only
point head annotation, but can estimate approximate size information of heads.
We show that LSC-CNN not only localizes better than existing density
regressors, but also outperforms them in counting. The code for our approach is
available at https://github.com/val-iisc/lsc-cnn.</description><subject>Computer Science - Computer Vision and Pattern Recognition</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj8tOwzAURL1hgQofwAp_AAl2_GZXhVelSLy6j5ybG2Qp2FWSBsrXEwqrkWakozmEXHCWS6sUu_bDV5hz7pjOmVHCnpKXKoGf8Iq-hW-kPra0TPs43dA1wH5Ylv5AX3FM_RziO33GtOuRhkhvMY5IyyF9tiOdg1-KCWEKKZ6Rk873I57_54ps7--25WNWPT1synWVeW1sphm0TaOXF5xr6VErI4UBK0FaaMFJ8N4VrtOyk7JA4aAxnDGLrRFOqUKsyOUf9uhU74bw4YdD_etWH93ED9_0SKA</recordid><startdate>20190618</startdate><enddate>20190618</enddate><creator>Sam, Deepak Babu</creator><creator>Peri, Skand Vishwanath</creator><creator>Sundararaman, Mukuntha Narayanan</creator><creator>Kamath, Amogh</creator><creator>Babu, R. Venkatesh</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20190618</creationdate><title>Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detection</title><author>Sam, Deepak Babu ; Peri, Skand Vishwanath ; Sundararaman, Mukuntha Narayanan ; Kamath, Amogh ; Babu, R. Venkatesh</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a678-60cdbb60751164ae657437c84c48cdc94caa929f64f442e39cb71008ed7395523</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Computer Science - Computer Vision and Pattern Recognition</topic><toplevel>online_resources</toplevel><creatorcontrib>Sam, Deepak Babu</creatorcontrib><creatorcontrib>Peri, Skand Vishwanath</creatorcontrib><creatorcontrib>Sundararaman, Mukuntha Narayanan</creatorcontrib><creatorcontrib>Kamath, Amogh</creatorcontrib><creatorcontrib>Babu, R. 
Venkatesh</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Sam, Deepak Babu</au><au>Peri, Skand Vishwanath</au><au>Sundararaman, Mukuntha Narayanan</au><au>Kamath, Amogh</au><au>Babu, R. Venkatesh</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detection</atitle><date>2019-06-18</date><risdate>2019</risdate><abstract>We introduce a detection framework for dense crowd counting and eliminate the
need for the prevalent density regression paradigm. Typical counting models
predict crowd density for an image as opposed to detecting every person. These
regression methods, in general, fail to localize persons accurately enough for
most applications other than counting. Hence, we adopt an architecture that
locates every person in the crowd, sizes the spotted heads with bounding boxes
and then counts them. Compared to normal object or face detectors, there exist
certain unique challenges in designing such a detection system. Some of them
are direct consequences of the huge diversity in dense crowds along with the
need to predict boxes contiguously. We solve these issues and develop our
LSC-CNN model, which can reliably detect heads of people across sparse to dense
crowds. LSC-CNN employs a multi-column architecture with top-down feedback
processing to better resolve persons and produce refined predictions at
multiple resolutions. Interestingly, the proposed training regime requires only
point head annotation, but can estimate approximate size information of heads.
We show that LSC-CNN not only localizes better than existing density
regressors, but also outperforms them in counting. The code for our approach is
available at https://github.com/val-iisc/lsc-cnn.</abstract><doi>10.48550/arxiv.1906.07538</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.1906.07538 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_1906_07538 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition |
title | Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detection |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T17%3A31%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Locate,%20Size%20and%20Count:%20Accurately%20Resolving%20People%20in%20Dense%20Crowds%20via%20Detection&rft.au=Sam,%20Deepak%20Babu&rft.date=2019-06-18&rft_id=info:doi/10.48550/arxiv.1906.07538&rft_dat=%3Carxiv_GOX%3E1906_07538%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
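
The description above contrasts density regression with counting by detection: locate each head, size it with a box, then count the boxes. The following is a minimal sketch of that last step only, not the LSC-CNN implementation. It assumes a hypothetical detector has already produced candidate head boxes with confidence scores; the function names, the box layout, the two thresholds, and the use of greedy non-maximum suppression are all illustrative assumptions that the record does not specify.

```python
from typing import List, Tuple

# Hypothetical box layout (an assumption, not from the record):
# (x_center, y_center, width, height, score)
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two center-format boxes."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def locate_size_count(
    candidates: List[Box],
    score_thresh: float = 0.5,  # assumed value, for illustration only
    iou_thresh: float = 0.3,    # assumed value, for illustration only
) -> Tuple[List[Box], int]:
    """Keep confident, non-overlapping boxes; the count is how many remain."""
    confident = [c for c in candidates if c[4] >= score_thresh]
    kept: List[Box] = []
    # Greedy non-maximum suppression: visit boxes from highest to lowest score
    # and drop any box that overlaps an already kept box too strongly.
    for box in sorted(confident, key=lambda c: c[4], reverse=True):
        if all(iou(box, k) <= iou_thresh for k in kept):
            kept.append(box)
    return kept, len(kept)


if __name__ == "__main__":
    candidates = [
        (10.0, 10.0, 4.0, 4.0, 0.9),  # a head
        (10.5, 10.0, 4.0, 4.0, 0.8),  # near-duplicate of the first box
        (30.0, 30.0, 6.0, 6.0, 0.7),  # a second, larger head
        (50.0, 50.0, 4.0, 4.0, 0.2),  # low-confidence candidate
    ]
    heads, count = locate_size_count(candidates)
    print(count, heads)
```

Unlike a density map, which only sums to a count, the kept boxes carry a position and a size for every counted person, which is the property the description argues matters for applications beyond counting.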