Multi-granularity representation learning for sketch-based dynamic face image retrieval
In some specific scenarios, a face sketch can be used to identify a person. However, drawing a face sketch often requires excellent skills and is time-consuming, which seriously hinders its widespread use in practical scenarios. The new framework of sketch less face image retrieval (SLFIR) (Dai et al....
Published in: | Applied intelligence (Dordrecht, Netherlands), 2025-01, Vol.55 (1), p.54, Article 54 |
---|---|
Main authors: | Wang, Liang ; Dai, Dawei ; Fu, Shiyu |
Format: | Article |
Language: | eng |
Subjects: | Image retrieval ; Learning ; Representations ; Retrieval ; Sketches |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | 1 |
container_start_page | 54 |
container_title | Applied intelligence (Dordrecht, Netherlands) |
container_volume | 55 |
creator | Wang, Liang ; Dai, Dawei ; Fu, Shiyu |
description | In some specific scenarios, a face sketch can be used to identify a person. However, drawing a face sketch often requires excellent skills and is time-consuming, which seriously hinders its widespread use in practical scenarios. The new sketch less face image retrieval (SLFIR) framework (Dai et al. 2023) explores providing some form of interaction between humans and machines during the drawing process to break these barriers. Within the SLFIR framework, there is a large gap between a partial sketch with few strokes and any whole face photo, resulting in poor performance at the early stage. In this study, we proposed multi-granularity (MG) representation learning (MGRL) to address the SLFIR problem, in which we learn representations for regions of different granularity in the partial sketch and its target image. Specifically, (1) a classical triplet network was first adopted to learn the joint embedding space shared between the complete sketch and its target face photo; (2) the partial sketch in the sketch drawing episode was then divided into MG regions; (3) another learnable branch in the triplet network was designed to optimize the representation of the multi-granularity regions; and (4) the final distance was determined by combining all the MG regions of the sketches and photos. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval performance on two publicly accessible datasets. Code is available at https://github.com/ddw2AIGROUP2CQUPT/MGRL |
doi_str_mv | 10.1007/s10489-024-05893-1 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3134195304</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3134195304</sourcerecordid><originalsourceid>FETCH-LOGICAL-c156t-ca04fadd4d022176d280696640acdd7ef873d5bb536496d5d81d52a322bb9f63</originalsourceid><addsrcrecordid>eNotkE1LAzEQhoMoWKt_wNOC5-jke3OUolaoeCnoLWSTbN26zdYkK_Tfu1pPc5jnfWd4ELomcEsA1F0mwGuNgXIMotYMkxM0I0IxrLhWp2gGelpJqd_P0UXOWwBgDMgMvb2MfenwJtk49jZ15VClsE8hh1hs6YZY9cGm2MVN1Q6pyp-huA_c2Bx85Q_R7jpXtdaFqtvZTZiyJXXh2_aX6Ky1fQ5X_3OO1o8P68USr16fnhf3K-yIkAU7C7y13nMPlBIlPa1Baik5WOe9Cm2tmBdNI5jkWnrha-IFtYzSptGtZHN0c6zdp-FrDLmY7TCmOF00jDBOtGDAJ4oeKZeGnFNozT5N_6aDIWB-_ZmjPzP5M3_-DGE_vh9kdw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3134195304</pqid></control><display><type>article</type><title>Multi-granularity representation learning for sketch-based dynamic face image retrieval</title><source>SpringerLink Journals - AutoHoldings</source><creator>Wang, Liang ; Dai, Dawei ; Fu, Shiyu</creator><creatorcontrib>Wang, Liang ; Dai, Dawei ; Fu, Shiyu</creatorcontrib><description>In some specific scenarios, a face sketch can be used to identify a person. However, drawing a face sketch often requires excellent skills and is time-consuming, which seriously hinders its widespread in the actual scenarios. The new framework of sketch less face image retrieval (SLFIR) (Dai et al. 2023) explores to provide some form of interaction between humans and machines during the drawing process to break the above barriers. Considering SLFIR framework, there is big gap between the partial sketch with few strokes and any one whole face photo, resulting in poor performance at the early stage. 
In this study, we proposed a multi-granularity (MG) representation learning (MGRL) to address the SLFIR problem, in which we learn the representation for the different granularity regions for the partial sketch and its target image. Specifically, (1) a classical triplet network was first adopted to learn the joint embedding space shared between the complete sketch and its target face photo; (2) Then, we divided the partial sketch in the sketch drawing episode into MG regions; Another learnable branch in the triplet network was designed to optimize the representation of the multi-granularity regions; Finally, by combining all the MG regions of the sketches and photos, the final distance was determined. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval performance on two publicly accessible datasets. Codes are available at https://github.com/ddw2AIGROUP2CQUPT/MGRL</description><identifier>ISSN: 0924-669X</identifier><identifier>EISSN: 1573-7497</identifier><identifier>DOI: 10.1007/s10489-024-05893-1</identifier><language>eng</language><publisher>Boston: Springer Nature B.V</publisher><subject>Image retrieval ; Learning ; Representations ; Retrieval ; Sketches</subject><ispartof>Applied intelligence (Dordrecht, Netherlands), 2025-01, Vol.55 (1), p.54, Article 54</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c156t-ca04fadd4d022176d280696640acdd7ef873d5bb536496d5d81d52a322bb9f63</cites><orcidid>0000-0002-8431-4431</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Wang, Liang</creatorcontrib><creatorcontrib>Dai, Dawei</creatorcontrib><creatorcontrib>Fu, Shiyu</creatorcontrib><title>Multi-granularity representation learning for sketch-based dynamic face image retrieval</title><title>Applied intelligence (Dordrecht, Netherlands)</title><description>In some specific scenarios, a face sketch can be used to identify a person. However, drawing a face sketch often requires excellent skills and is time-consuming, which seriously hinders its widespread in the actual scenarios. The new framework of sketch less face image retrieval (SLFIR) (Dai et al. 2023) explores to provide some form of interaction between humans and machines during the drawing process to break the above barriers. Considering SLFIR framework, there is big gap between the partial sketch with few strokes and any one whole face photo, resulting in poor performance at the early stage. In this study, we proposed a multi-granularity (MG) representation learning (MGRL) to address the SLFIR problem, in which we learn the representation for the different granularity regions for the partial sketch and its target image. 
Specifically, (1) a classical triplet network was first adopted to learn the joint embedding space shared between the complete sketch and its target face photo; (2) Then, we divided the partial sketch in the sketch drawing episode into MG regions; Another learnable branch in the triplet network was designed to optimize the representation of the multi-granularity regions; Finally, by combining all the MG regions of the sketches and photos, the final distance was determined. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval performance on two publicly accessible datasets. Codes are available at https://github.com/ddw2AIGROUP2CQUPT/MGRL</description><subject>Image retrieval</subject><subject>Learning</subject><subject>Representations</subject><subject>Retrieval</subject><subject>Sketches</subject><issn>0924-669X</issn><issn>1573-7497</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2025</creationdate><recordtype>article</recordtype><recordid>eNotkE1LAzEQhoMoWKt_wNOC5-jke3OUolaoeCnoLWSTbN26zdYkK_Tfu1pPc5jnfWd4ELomcEsA1F0mwGuNgXIMotYMkxM0I0IxrLhWp2gGelpJqd_P0UXOWwBgDMgMvb2MfenwJtk49jZ15VClsE8hh1hs6YZY9cGm2MVN1Q6pyp-huA_c2Bx85Q_R7jpXtdaFqtvZTZiyJXXh2_aX6Ky1fQ5X_3OO1o8P68USr16fnhf3K-yIkAU7C7y13nMPlBIlPa1Baik5WOe9Cm2tmBdNI5jkWnrha-IFtYzSptGtZHN0c6zdp-FrDLmY7TCmOF00jDBOtGDAJ4oeKZeGnFNozT5N_6aDIWB-_ZmjPzP5M3_-DGE_vh9kdw</recordid><startdate>202501</startdate><enddate>202501</enddate><creator>Wang, Liang</creator><creator>Dai, Dawei</creator><creator>Fu, Shiyu</creator><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-8431-4431</orcidid></search><sort><creationdate>202501</creationdate><title>Multi-granularity representation learning for sketch-based dynamic face image retrieval</title><author>Wang, Liang ; Dai, Dawei ; Fu, 
Shiyu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c156t-ca04fadd4d022176d280696640acdd7ef873d5bb536496d5d81d52a322bb9f63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2025</creationdate><topic>Image retrieval</topic><topic>Learning</topic><topic>Representations</topic><topic>Retrieval</topic><topic>Sketches</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Liang</creatorcontrib><creatorcontrib>Dai, Dawei</creatorcontrib><creatorcontrib>Fu, Shiyu</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wang, Liang</au><au>Dai, Dawei</au><au>Fu, Shiyu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Multi-granularity representation learning for sketch-based dynamic face image retrieval</atitle><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle><date>2025-01</date><risdate>2025</risdate><volume>55</volume><issue>1</issue><spage>54</spage><pages>54-</pages><artnum>54</artnum><issn>0924-669X</issn><eissn>1573-7497</eissn><abstract>In some specific scenarios, a face sketch can be used to identify a person. However, drawing a face sketch often requires excellent skills and is time-consuming, which seriously hinders its widespread in the actual scenarios. 
The new framework of sketch less face image retrieval (SLFIR) (Dai et al. 2023) explores to provide some form of interaction between humans and machines during the drawing process to break the above barriers. Considering SLFIR framework, there is big gap between the partial sketch with few strokes and any one whole face photo, resulting in poor performance at the early stage. In this study, we proposed a multi-granularity (MG) representation learning (MGRL) to address the SLFIR problem, in which we learn the representation for the different granularity regions for the partial sketch and its target image. Specifically, (1) a classical triplet network was first adopted to learn the joint embedding space shared between the complete sketch and its target face photo; (2) Then, we divided the partial sketch in the sketch drawing episode into MG regions; Another learnable branch in the triplet network was designed to optimize the representation of the multi-granularity regions; Finally, by combining all the MG regions of the sketches and photos, the final distance was determined. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval performance on two publicly accessible datasets. Codes are available at https://github.com/ddw2AIGROUP2CQUPT/MGRL</abstract><cop>Boston</cop><pub>Springer Nature B.V</pub><doi>10.1007/s10489-024-05893-1</doi><orcidid>https://orcid.org/0000-0002-8431-4431</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0924-669X |
ispartof | Applied intelligence (Dordrecht, Netherlands), 2025-01, Vol.55 (1), p.54, Article 54 |
issn | 0924-669X ; 1573-7497 |
language | eng |
recordid | cdi_proquest_journals_3134195304 |
source | SpringerLink Journals - AutoHoldings |
subjects | Image retrieval ; Learning ; Representations ; Retrieval ; Sketches |
title | Multi-granularity representation learning for sketch-based dynamic face image retrieval |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T19%3A13%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multi-granularity%20representation%20learning%20for%20sketch-based%20dynamic%20face%20image%20retrieval&rft.jtitle=Applied%20intelligence%20(Dordrecht,%20Netherlands)&rft.au=Wang,%20Liang&rft.date=2025-01&rft.volume=55&rft.issue=1&rft.spage=54&rft.pages=54-&rft.artnum=54&rft.issn=0924-669X&rft.eissn=1573-7497&rft_id=info:doi/10.1007/s10489-024-05893-1&rft_dat=%3Cproquest_cross%3E3134195304%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3134195304&rft_id=info:pmid/&rfr_iscdi=true |
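
The description field above outlines a triplet network combined with a distance computed over multi-granularity regions. The following is only a minimal NumPy sketch of that general idea, not the authors' implementation (which is linked in the description). Everything specific here is an assumption made for illustration: the function names `triplet_loss`, `split_regions`, and `multi_granularity_distance`, the grid sizes `(1, 2, 4)`, the margin, the use of mean pooling to represent a region, and the choice to combine regions by summing Euclidean distances. The record does not say how the regions are defined or combined.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss on embedding vectors (Euclidean distance).

    The margin value is an illustrative assumption.
    """
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)


def split_regions(feature_map, grid):
    """Split an (H, W, C) feature map into grid x grid regions.

    Each region is mean-pooled to a C-dimensional vector. Mean pooling is a
    stand-in (assumption) for a learned region representation.
    """
    h, w, _ = feature_map.shape
    rows = np.array_split(np.arange(h), grid)
    cols = np.array_split(np.arange(w), grid)
    return [feature_map[np.ix_(r, c)].mean(axis=(0, 1)) for r in rows for c in cols]


def multi_granularity_distance(sketch_map, photo_map, grids=(1, 2, 4)):
    """Sum region-wise distances over several granularities.

    Summation is an assumed way of "combining all the MG regions".
    """
    total = 0.0
    for g in grids:
        s_regions = split_regions(sketch_map, g)
        p_regions = split_regions(photo_map, g)
        total += sum(np.linalg.norm(s - p) for s, p in zip(s_regions, p_regions))
    return total


# Usage: rank two candidate photos against a (hypothetical) partial-sketch feature map.
rng = np.random.default_rng(0)
photo_a = rng.normal(size=(8, 8, 4))
photo_b = rng.normal(size=(8, 8, 4))
sketch = photo_a + 0.01 * rng.normal(size=(8, 8, 4))  # sketch resembling photo_a
ranking = sorted(
    [("photo_a", photo_a), ("photo_b", photo_b)],
    key=lambda item: multi_granularity_distance(sketch, item[1]),
)
print([name for name, _ in ranking])
```

In this toy setup the coarse 1 x 1 grid compares whole-image statistics while the finer grids compare local regions, which is one plausible reading of why several granularities might help when a sketch has only a few strokes.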