Analysis of Fine-grained Counting Methods for Masked Face Counting: A Comparative Study

Bibliographic details
Published in: IEEE Access 2024-01, Vol. 12, p. 1-1
Main authors: Nguyen, Khanh-Duy; Nguyen, Huy H.; Le, Trung-Nghia; Yamagishi, Junichi; Echizen, Isao
Format: Article
Language: English
Description
Abstract: Masked face counting involves counting faces at various crowd densities while discriminating between masked and unmasked faces, and it is generally treated as an object (i.e., face) detection task. Counting accuracy is limited, especially at higher densities, where faces are relatively small, unclear, and viewed from various angles. Furthermore, creating the ground-truth bounding boxes needed to train object detection methods is costly. We formulate masked face detection as a fine-grained crowd-counting task, a formulation that suits this challenging problem when combined with density map regression. However, adopting fine-grained crowd-counting methods for masked face counting is not trivial: strategies appropriate for both counting and multi-class classification must be identified. We contrasted the strategies of various approaches and examined their benefits and drawbacks. The comparisons cover (1) simple regression compared with mixed regression and detection for counting, (2) class-aware density maps compared with semantic segmentation maps and class probabilities for classification, and (3) counting with or without depth information enhancement. An analysis of seven crowd-counting methods on three datasets with a total of about 900k annotations showed that the level of congestion affects how well simple regression and mixed regression and detection work for counting, while semantic segmentation maps were the most effective approach for classification. An evaluation of depth data demonstrated the need for a depth map to achieve accurate counting. These findings should be useful for future studies.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3367593