Multi-granularity representation learning for sketch-based dynamic face image retrieval

In some scenarios, a face sketch can be used to identify a person. However, drawing a face sketch requires considerable skill and is time-consuming, which seriously hinders its widespread use in real-world scenarios. The new framework of sketch less face image retrieval (SLFIR) (Dai et al....

Detailed description

Bibliographic details
Published in: Applied intelligence (Dordrecht, Netherlands), 2025-01, Vol.55 (1), p.54, Article 54
Main authors: Wang, Liang; Dai, Dawei; Fu, Shiyu
Format: Article
Language: eng
Subjects: Image retrieval; Learning; Representations; Retrieval; Sketches
Online access: Full text
container_end_page
container_issue 1
container_start_page 54
container_title Applied intelligence (Dordrecht, Netherlands)
container_volume 55
creator Wang, Liang
Dai, Dawei
Fu, Shiyu
description In some scenarios, a face sketch can be used to identify a person. However, drawing a face sketch requires considerable skill and is time-consuming, which seriously hinders its widespread use in real-world scenarios. The sketch less face image retrieval (SLFIR) framework (Dai et al. 2023) introduces interaction between humans and machines during the drawing process to overcome these barriers. Within the SLFIR framework, there is a large gap between a partial sketch with few strokes and any whole face photo, which results in poor retrieval performance at the early stage of drawing. In this study, we propose multi-granularity (MG) representation learning (MGRL) to address the SLFIR problem: we learn representations of regions at different granularities for the partial sketch and its target image. Specifically, (1) a classical triplet network is first adopted to learn the joint embedding space shared between the complete sketch and its target face photo; (2) the partial sketches in a sketch drawing episode are then divided into MG regions, and another learnable branch in the triplet network is designed to optimize the representation of these regions; (3) finally, the final distance is determined by combining all the MG regions of the sketches and photos. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval performance on two publicly accessible datasets. Code is available at https://github.com/ddw2AIGROUP2CQUPT/MGRL
doi_str_mv 10.1007/s10489-024-05893-1
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3134195304</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3134195304</sourcerecordid><originalsourceid>FETCH-LOGICAL-c156t-ca04fadd4d022176d280696640acdd7ef873d5bb536496d5d81d52a322bb9f63</originalsourceid><addsrcrecordid>eNotkE1LAzEQhoMoWKt_wNOC5-jke3OUolaoeCnoLWSTbN26zdYkK_Tfu1pPc5jnfWd4ELomcEsA1F0mwGuNgXIMotYMkxM0I0IxrLhWp2gGelpJqd_P0UXOWwBgDMgMvb2MfenwJtk49jZ15VClsE8hh1hs6YZY9cGm2MVN1Q6pyp-huA_c2Bx85Q_R7jpXtdaFqtvZTZiyJXXh2_aX6Ky1fQ5X_3OO1o8P68USr16fnhf3K-yIkAU7C7y13nMPlBIlPa1Baik5WOe9Cm2tmBdNI5jkWnrha-IFtYzSptGtZHN0c6zdp-FrDLmY7TCmOF00jDBOtGDAJ4oeKZeGnFNozT5N_6aDIWB-_ZmjPzP5M3_-DGE_vh9kdw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3134195304</pqid></control><display><type>article</type><title>Multi-granularity representation learning for sketch-based dynamic face image retrieval</title><source>SpringerLink Journals - AutoHoldings</source><creator>Wang, Liang ; Dai, Dawei ; Fu, Shiyu</creator><creatorcontrib>Wang, Liang ; Dai, Dawei ; Fu, Shiyu</creatorcontrib><description>In some specific scenarios, a face sketch can be used to identify a person. However, drawing a face sketch often requires excellent skills and is time-consuming, which seriously hinders its widespread in the actual scenarios. The new framework of sketch less face image retrieval (SLFIR) (Dai et al. 2023) explores to provide some form of interaction between humans and machines during the drawing process to break the above barriers. Considering SLFIR framework, there is big gap between the partial sketch with few strokes and any one whole face photo, resulting in poor performance at the early stage. 
In this study, we proposed a multi-granularity (MG) representation learning (MGRL) to address the SLFIR problem, in which we learn the representation for the different granularity regions for the partial sketch and its target image. Specifically, (1) a classical triplet network was first adopted to learn the joint embedding space shared between the complete sketch and its target face photo; (2) Then, we divided the partial sketch in the sketch drawing episode into MG regions; Another learnable branch in the triplet network was designed to optimize the representation of the multi-granularity regions; Finally, by combining all the MG regions of the sketches and photos, the final distance was determined. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval performance on two publicly accessible datasets. Codes are available at https://github.com/ddw2AIGROUP2CQUPT/MGRL</description><identifier>ISSN: 0924-669X</identifier><identifier>EISSN: 1573-7497</identifier><identifier>DOI: 10.1007/s10489-024-05893-1</identifier><language>eng</language><publisher>Boston: Springer Nature B.V</publisher><subject>Image retrieval ; Learning ; Representations ; Retrieval ; Sketches</subject><ispartof>Applied intelligence (Dordrecht, Netherlands), 2025-01, Vol.55 (1), p.54, Article 54</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c156t-ca04fadd4d022176d280696640acdd7ef873d5bb536496d5d81d52a322bb9f63</cites><orcidid>0000-0002-8431-4431</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Wang, Liang</creatorcontrib><creatorcontrib>Dai, Dawei</creatorcontrib><creatorcontrib>Fu, Shiyu</creatorcontrib><title>Multi-granularity representation learning for sketch-based dynamic face image retrieval</title><title>Applied intelligence (Dordrecht, Netherlands)</title><description>In some specific scenarios, a face sketch can be used to identify a person. However, drawing a face sketch often requires excellent skills and is time-consuming, which seriously hinders its widespread in the actual scenarios. The new framework of sketch less face image retrieval (SLFIR) (Dai et al. 2023) explores to provide some form of interaction between humans and machines during the drawing process to break the above barriers. Considering SLFIR framework, there is big gap between the partial sketch with few strokes and any one whole face photo, resulting in poor performance at the early stage. In this study, we proposed a multi-granularity (MG) representation learning (MGRL) to address the SLFIR problem, in which we learn the representation for the different granularity regions for the partial sketch and its target image. 
Specifically, (1) a classical triplet network was first adopted to learn the joint embedding space shared between the complete sketch and its target face photo; (2) Then, we divided the partial sketch in the sketch drawing episode into MG regions; Another learnable branch in the triplet network was designed to optimize the representation of the multi-granularity regions; Finally, by combining all the MG regions of the sketches and photos, the final distance was determined. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval performance on two publicly accessible datasets. Codes are available at https://github.com/ddw2AIGROUP2CQUPT/MGRL</description><subject>Image retrieval</subject><subject>Learning</subject><subject>Representations</subject><subject>Retrieval</subject><subject>Sketches</subject><issn>0924-669X</issn><issn>1573-7497</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2025</creationdate><recordtype>article</recordtype><recordid>eNotkE1LAzEQhoMoWKt_wNOC5-jke3OUolaoeCnoLWSTbN26zdYkK_Tfu1pPc5jnfWd4ELomcEsA1F0mwGuNgXIMotYMkxM0I0IxrLhWp2gGelpJqd_P0UXOWwBgDMgMvb2MfenwJtk49jZ15VClsE8hh1hs6YZY9cGm2MVN1Q6pyp-huA_c2Bx85Q_R7jpXtdaFqtvZTZiyJXXh2_aX6Ky1fQ5X_3OO1o8P68USr16fnhf3K-yIkAU7C7y13nMPlBIlPa1Baik5WOe9Cm2tmBdNI5jkWnrha-IFtYzSptGtZHN0c6zdp-FrDLmY7TCmOF00jDBOtGDAJ4oeKZeGnFNozT5N_6aDIWB-_ZmjPzP5M3_-DGE_vh9kdw</recordid><startdate>202501</startdate><enddate>202501</enddate><creator>Wang, Liang</creator><creator>Dai, Dawei</creator><creator>Fu, Shiyu</creator><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-8431-4431</orcidid></search><sort><creationdate>202501</creationdate><title>Multi-granularity representation learning for sketch-based dynamic face image retrieval</title><author>Wang, Liang ; Dai, Dawei ; Fu, 
Shiyu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c156t-ca04fadd4d022176d280696640acdd7ef873d5bb536496d5d81d52a322bb9f63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2025</creationdate><topic>Image retrieval</topic><topic>Learning</topic><topic>Representations</topic><topic>Retrieval</topic><topic>Sketches</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Liang</creatorcontrib><creatorcontrib>Dai, Dawei</creatorcontrib><creatorcontrib>Fu, Shiyu</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wang, Liang</au><au>Dai, Dawei</au><au>Fu, Shiyu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Multi-granularity representation learning for sketch-based dynamic face image retrieval</atitle><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle><date>2025-01</date><risdate>2025</risdate><volume>55</volume><issue>1</issue><spage>54</spage><pages>54-</pages><artnum>54</artnum><issn>0924-669X</issn><eissn>1573-7497</eissn><abstract>In some specific scenarios, a face sketch can be used to identify a person. However, drawing a face sketch often requires excellent skills and is time-consuming, which seriously hinders its widespread in the actual scenarios. 
The new framework of sketch less face image retrieval (SLFIR) (Dai et al. 2023) explores to provide some form of interaction between humans and machines during the drawing process to break the above barriers. Considering SLFIR framework, there is big gap between the partial sketch with few strokes and any one whole face photo, resulting in poor performance at the early stage. In this study, we proposed a multi-granularity (MG) representation learning (MGRL) to address the SLFIR problem, in which we learn the representation for the different granularity regions for the partial sketch and its target image. Specifically, (1) a classical triplet network was first adopted to learn the joint embedding space shared between the complete sketch and its target face photo; (2) Then, we divided the partial sketch in the sketch drawing episode into MG regions; Another learnable branch in the triplet network was designed to optimize the representation of the multi-granularity regions; Finally, by combining all the MG regions of the sketches and photos, the final distance was determined. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval performance on two publicly accessible datasets. Codes are available at https://github.com/ddw2AIGROUP2CQUPT/MGRL</abstract><cop>Boston</cop><pub>Springer Nature B.V</pub><doi>10.1007/s10489-024-05893-1</doi><orcidid>https://orcid.org/0000-0002-8431-4431</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0924-669X
ispartof Applied intelligence (Dordrecht, Netherlands), 2025-01, Vol.55 (1), p.54, Article 54
issn 0924-669X
1573-7497
language eng
recordid cdi_proquest_journals_3134195304
source SpringerLink Journals - AutoHoldings
subjects Image retrieval
Learning
Representations
Retrieval
Sketches
title Multi-granularity representation learning for sketch-based dynamic face image retrieval
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T19%3A13%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multi-granularity%20representation%20learning%20for%20sketch-based%20dynamic%20face%20image%20retrieval&rft.jtitle=Applied%20intelligence%20(Dordrecht,%20Netherlands)&rft.au=Wang,%20Liang&rft.date=2025-01&rft.volume=55&rft.issue=1&rft.spage=54&rft.pages=54-&rft.artnum=54&rft.issn=0924-669X&rft.eissn=1573-7497&rft_id=info:doi/10.1007/s10489-024-05893-1&rft_dat=%3Cproquest_cross%3E3134195304%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3134195304&rft_id=info:pmid/&rfr_iscdi=true
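
Illustrative sketch of the ideas named in the description field above (triplet loss, multi-granularity regions, combined distance). This is not the authors' code; their implementation is at the GitHub URL in the description. Everything here is an assumption made for illustration: the regular grid split, the granularities (1, 2), the toy mean-intensity embedding that stands in for the learned branches, the plain sum used to combine region distances, and all function names.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]
Image = List[List[float]]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor: Vector, positive: Vector, negative: Vector,
                 margin: float = 0.2) -> float:
    """Standard triplet margin loss: zero once the negative is farther
    from the anchor than the positive by at least `margin`."""
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


def split_into_grid(image: Image, cells: int) -> List[Image]:
    """Split a square image into cells x cells regions, in row-major order.
    ASSUMPTION: a regular grid stands in for the paper's MG regions."""
    size = len(image)
    if cells < 1 or size % cells != 0 or any(len(r) != size for r in image):
        raise ValueError("image must be square and divisible by `cells`")
    step = size // cells
    return [
        [row[c:c + step] for row in image[r:r + step]]
        for r in range(0, size, step)
        for c in range(0, size, step)
    ]


def multi_granularity_regions(image: Image,
                              granularities: Sequence[int] = (1, 2)) -> List[Image]:
    """Collect regions at every granularity (1 = whole image, 2 = 2x2 grid)."""
    regions: List[Image] = []
    for cells in granularities:
        regions.extend(split_into_grid(image, cells))
    return regions


def toy_embedding(region: Image) -> List[float]:
    """HYPOTHETICAL stand-in for a learned encoder: mean pixel intensity."""
    values = [v for row in region for v in row]
    return [sum(values) / len(values)]


def final_distance(sketch: Image, photo: Image) -> float:
    """Combine region-wise distances into one score.
    ASSUMPTION: a plain sum; the abstract does not state the rule."""
    pairs = zip(multi_granularity_regions(sketch),
                multi_granularity_regions(photo))
    return sum(euclidean(toy_embedding(s), toy_embedding(p)) for s, p in pairs)


if __name__ == "__main__":
    sketch = [[1.0, 0.0, 0.0, 0.0]] + [[0.0] * 4 for _ in range(3)]
    photo = [[0.0] * 4 for _ in range(4)]
    print(final_distance(sketch, photo))
```

In a retrieval setting, a score like `final_distance` would be computed between the partial sketch and each gallery photo, and the photos ranked by ascending distance. A learned encoder would replace `toy_embedding`.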