Portrait Interpretation and a Benchmark
Main authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Abstract: | We propose a task we name Portrait Interpretation and construct a dataset
named Portrait250K for it. Current research on portraits, such as human
attribute recognition and person re-identification, has achieved many
successes, but in general such work: 1) may not exploit the interrelationships
between tasks and the benefits they may bring; 2) designs a deep
model specifically for each task, which is inefficient; 3) may be unable to
meet the need for a unified model and comprehensive perception in real
scenes. In this paper, the proposed portrait interpretation approaches the
perception of humans from a new systematic perspective. We divide the
perception of portraits into three aspects, namely Appearance, Posture, and
Emotion, and design corresponding sub-tasks for each aspect. Based on the
framework of multi-task learning, portrait interpretation requires a
comprehensive description of static attributes and dynamic states of portraits.
To invigorate research on this new task, we construct a new dataset that
contains 250,000 images labeled with identity, gender, age, physique, height,
expression, and posture of the whole body and arms. Our dataset is collected
from 51 movies and therefore covers a wide range of diversity. Furthermore, we focus on
representation learning for portrait interpretation and propose a baseline that
reflects our systematic perspective. We also propose an appropriate metric for
this task. Our experimental results demonstrate that combining the tasks
related to portrait interpretation can yield benefits. Code and dataset will be
made public. |
---|---|
DOI: | 10.48550/arxiv.2207.13315 |