Multi-granularity representation learning for sketch-based dynamic face image retrieval

Bibliographic details
Published in: Applied Intelligence (Dordrecht, Netherlands), 2025-01, Vol. 55 (1), p. 54, Article 54
Main authors: Wang, Liang; Dai, Dawei; Fu, Shiyu
Format: Article
Language: English
Description
Summary: In some scenarios, a face sketch can be used to identify a person. However, drawing a face sketch requires considerable skill and is time-consuming, which seriously hinders its widespread use in practice. The sketch less face image retrieval (SLFIR) framework (Dai et al. 2023) explores providing some form of interaction between humans and machines during the drawing process to break these barriers. Under the SLFIR framework, however, there is a large gap between a partial sketch with only a few strokes and any complete face photo, which results in poor retrieval performance at the early stage of drawing. In this study, we propose multi-granularity (MG) representation learning (MGRL) to address the SLFIR problem, in which representations are learned for regions of different granularity in the partial sketch and its target image. Specifically, (1) a classical triplet network is first adopted to learn the joint embedding space shared between the complete sketch and its target face photo; (2) the partial sketch in the sketch drawing episode is then divided into MG regions; (3) another learnable branch in the triplet network is designed to optimize the representation of the MG regions; and (4) the final distance is determined by combining all the MG regions of the sketches and photos. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval performance on two publicly accessible datasets. Code is available at https://github.com/ddw2AIGROUP2CQUPT/MGRL
ISSN: 0924-669X, 1573-7497
DOI: 10.1007/s10489-024-05893-1
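
Illustrative note: the summary describes a triplet network plus a final distance that combines multi-granularity regions, but it does not give the formulas. The Python sketch below is only a rough illustration under stated assumptions and is not the authors' implementation. The grid-based region split, the placeholder `embed` function (a stand-in for a learned encoder), the unweighted sum used to combine region distances, and all function names are hypothetical. Only the standard triplet margin loss, max(0, d(a, p) - d(a, n) + margin), is a well-known formula.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def split_regions(image, grid):
    """Split an H x W image (list of rows) into grid x grid blocks, row-major.

    Assumption: a regular grid stands in for the paper's MG regions.
    """
    h, w = len(image), len(image[0])
    bh, bw = h // grid, w // grid
    regions = []
    for gy in range(grid):
        for gx in range(grid):
            rows = image[gy * bh:(gy + 1) * bh]
            regions.append([row[gx * bw:(gx + 1) * bw] for row in rows])
    return regions


def embed(region):
    """Hypothetical placeholder embedding: [mean, max] intensity of the region.

    A real system would use a trained network branch here.
    """
    flat = [v for row in region for v in row]
    return [sum(flat) / len(flat), max(flat)]


def mg_distance(sketch, photo, grids=(1, 2)):
    """Sum region-wise distances over several granularities (assumed combination rule)."""
    total = 0.0
    for g in grids:
        for rs, rp in zip(split_regions(sketch, g), split_regions(photo, g)):
            total += euclidean(embed(rs), embed(rp))
    return total


def rank_gallery(sketch, gallery):
    """Return gallery indices sorted from the closest photo to the farthest."""
    return sorted(range(len(gallery)), key=lambda i: mg_distance(sketch, gallery[i]))
```

In a sketch-less retrieval setting, `rank_gallery` would be called again after each new stroke, so the ranking is refreshed as the partial sketch grows.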