One Metric to Measure them All: Localisation Recall Precision (LRP) for Evaluating Visual Detection Tasks
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Despite being widely used as a performance measure for visual detection
tasks, Average Precision (AP) is limited in (i) reflecting localisation
quality, (ii) interpretability and (iii) robustness to the design choices
regarding its computation, and its applicability to outputs without confidence
scores. Panoptic Quality (PQ), a measure proposed for evaluating panoptic
segmentation (Kirillov et al., 2019), does not suffer from these limitations
but is limited to panoptic segmentation. In this paper, we propose Localisation
Recall Precision (LRP) Error as the average matching error of a visual detector
computed based on both its localisation and classification qualities for a
given confidence score threshold. LRP Error, initially proposed only for object
detection by Oksuz et al. (2018), does not suffer from the aforementioned
limitations and is applicable to all visual detection tasks. We also introduce
Optimal LRP (oLRP) Error as the minimum LRP Error obtained over confidence
scores to evaluate visual detectors and obtain optimal thresholds for
deployment. We provide a detailed comparative analysis of LRP Error with AP and
PQ, and use nearly 100 state-of-the-art visual detectors from seven visual
detection tasks (i.e. object detection, keypoint detection, instance
segmentation, panoptic segmentation, visual relationship detection, zero-shot
detection and generalised zero-shot detection) using ten datasets to
empirically show that LRP Error provides richer and more discriminative
information than its counterparts. Code available at:
https://github.com/kemaloksuz/LRP-Error |
DOI: | 10.48550/arxiv.2011.10772 |
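
The summary describes LRP Error as an average matching error at a fixed confidence score threshold, and oLRP Error as its minimum over thresholds. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes the commonly cited form of the measure, in which each true positive contributes its localisation error normalised by `1 - tau`, and each false positive or false negative contributes 1. The record itself does not state this formula, so the normalisation, the function names `lrp_error` and `optimal_lrp`, and the input layout are all assumptions made for illustration. Matching detections to ground truth is omitted; the sketch starts from already matched counts.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical container: for one score threshold, the localisation qualities
# (e.g. IoU values) of the true positives, plus the FP and FN counts.
Matched = Tuple[Sequence[float], int, int]


def lrp_error(tp_quality: Sequence[float], num_fp: int, num_fn: int,
              tau: float = 0.5) -> float:
    """Average matching error at one score threshold (assumed formulation).

    tp_quality holds the localisation quality of each true positive,
    each expected to lie in (tau, 1]. tau is the TP validation threshold.
    """
    num_tp = len(tp_quality)
    total = num_tp + num_fp + num_fn
    if total == 0:
        # Convention assumed here: nothing to match means no error.
        return 0.0
    # Localisation error of each TP, rescaled to [0, 1).
    loc_term = sum((1.0 - q) / (1.0 - tau) for q in tp_quality)
    # Every FP and FN counts as a full error of 1.
    return (loc_term + num_fp + num_fn) / total


def optimal_lrp(per_threshold: Dict[float, Matched],
                tau: float = 0.5) -> Tuple[float, float]:
    """Return (minimum LRP Error, score threshold that achieves it)."""
    best_score = min(
        per_threshold,
        key=lambda s: lrp_error(*per_threshold[s], tau=tau),
    )
    return lrp_error(*per_threshold[best_score], tau=tau), best_score


# Made-up numbers for illustration only, not results from the paper.
example = {
    0.3: ([0.75, 1.0], 1, 1),  # two TPs, one FP, one FN
    0.6: ([1.0], 0, 2),        # one TP, no FP, two FNs
}
best_error, best_threshold = optimal_lrp(example)
```

With these made-up inputs, the threshold 0.3 gives the lower error, which is the sense in which oLRP Error also yields a threshold for deployment. For the reference implementation, see the repository linked in the summary.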