Interpretable Foreground Object Search As Knowledge Distillation
This paper proposes a knowledge distillation method for foreground object search (FoS). Given a background and a rectangle specifying the foreground location and scale, FoS retrieves compatible foregrounds in a certain category for later image composition. Foregrounds within the same category can be...
Main authors: | Li, Boren; Zhuang, Po-Yu; Gu, Jian; Li, Mingyang; Tan, Ping |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Li, Boren ; Zhuang, Po-Yu ; Gu, Jian ; Li, Mingyang ; Tan, Ping |
description | This paper proposes a knowledge distillation method for foreground object
search (FoS). Given a background and a rectangle specifying the foreground
location and scale, FoS retrieves compatible foregrounds in a certain category
for later image composition. Foregrounds within the same category can be
grouped into a small number of patterns. Instances within each pattern are
compatible with any query input interchangeably. These instances are referred
to as interchangeable foregrounds. We first present a pipeline to build
a pattern-level FoS dataset containing labels of interchangeable foregrounds. We
then establish a benchmark dataset for further training and testing following
the pipeline. As for the proposed method, we first train a foreground encoder
to learn representations of interchangeable foregrounds. We then train a query
encoder to learn query-foreground compatibility following a knowledge
distillation framework. It aims to transfer knowledge from interchangeable
foregrounds to supervise representation learning of compatibility. The query
feature representation is projected to the same latent space as interchangeable
foregrounds, enabling very efficient and interpretable instance-level search.
Furthermore, pattern-level search is feasible to retrieve more controllable,
reasonable and diverse foregrounds. The proposed method outperforms the
previous state-of-the-art by 10.42% in absolute difference and 24.06% in
relative improvement evaluated by mean average precision (mAP). Extensive
experimental results also demonstrate its efficacy from various aspects. The
benchmark dataset and code will be released shortly. |
doi_str_mv | 10.48550/arxiv.2007.09867 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2007_09867</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2007_09867</sourcerecordid><originalsourceid>FETCH-LOGICAL-a677-1198157be08be8d423965a27e768357f92f72ba2fd5ba95eb54503eab43016db3</originalsourceid><addsrcrecordid>eNotz7tuwjAUgGEvHSroA3SqXyCpL_FtA9HSoiIxlD06xidg5CbIcW9vXwGd_u2XPkLuOasbqxR7hPwTv2rBmKmZs9rcktmqL5hPGQv4hHQ5ZNzn4bMPdOOPuCv0HSHvDnQ-0rd--E4Y9kif4lhiSlDi0E_JTQdpxLv_Tsh2-bxdvFbrzctqMV9XoI2pOHeWK-ORWY82NEI6rUAYNNpKZTonOiM8iC4oD06hV41iEsE3knEdvJyQh-v2QmhPOX5A_m3PlPZCkX8QU0Oe</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Interpretable Foreground Object Search As Knowledge Distillation</title><source>arXiv.org</source><creator>Li, Boren ; Zhuang, Po-Yu ; Gu, Jian ; Li, Mingyang ; Tan, Ping</creator><creatorcontrib>Li, Boren ; Zhuang, Po-Yu ; Gu, Jian ; Li, Mingyang ; Tan, Ping</creatorcontrib><description>This paper proposes a knowledge distillation method for foreground object
search (FoS). Given a background and a rectangle specifying the foreground
location and scale, FoS retrieves compatible foregrounds in a certain category
for later image composition. Foregrounds within the same category can be
grouped into a small number of patterns. Instances within each pattern are
compatible with any query input interchangeably. These instances are referred
to as interchangeable foregrounds. We first present a pipeline to build
a pattern-level FoS dataset containing labels of interchangeable foregrounds. We
then establish a benchmark dataset for further training and testing following
the pipeline. As for the proposed method, we first train a foreground encoder
to learn representations of interchangeable foregrounds. We then train a query
encoder to learn query-foreground compatibility following a knowledge
distillation framework. It aims to transfer knowledge from interchangeable
foregrounds to supervise representation learning of compatibility. The query
feature representation is projected to the same latent space as interchangeable
foregrounds, enabling very efficient and interpretable instance-level search.
Furthermore, pattern-level search is feasible to retrieve more controllable,
reasonable and diverse foregrounds. The proposed method outperforms the
previous state-of-the-art by 10.42% in absolute difference and 24.06% in
relative improvement evaluated by mean average precision (mAP). Extensive
experimental results also demonstrate its efficacy from various aspects. The
benchmark dataset and code will be released shortly.</description><identifier>DOI: 10.48550/arxiv.2007.09867</identifier><language>eng</language><subject>Computer Science - Computer Vision and Pattern Recognition</subject><creationdate>2020-07</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2007.09867$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2007.09867$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Li, Boren</creatorcontrib><creatorcontrib>Zhuang, Po-Yu</creatorcontrib><creatorcontrib>Gu, Jian</creatorcontrib><creatorcontrib>Li, Mingyang</creatorcontrib><creatorcontrib>Tan, Ping</creatorcontrib><title>Interpretable Foreground Object Search As Knowledge Distillation</title><description>This paper proposes a knowledge distillation method for foreground object
search (FoS). Given a background and a rectangle specifying the foreground
location and scale, FoS retrieves compatible foregrounds in a certain category
for later image composition. Foregrounds within the same category can be
grouped into a small number of patterns. Instances within each pattern are
compatible with any query input interchangeably. These instances are referred
to as interchangeable foregrounds. We first present a pipeline to build
a pattern-level FoS dataset containing labels of interchangeable foregrounds. We
then establish a benchmark dataset for further training and testing following
the pipeline. As for the proposed method, we first train a foreground encoder
to learn representations of interchangeable foregrounds. We then train a query
encoder to learn query-foreground compatibility following a knowledge
distillation framework. It aims to transfer knowledge from interchangeable
foregrounds to supervise representation learning of compatibility. The query
feature representation is projected to the same latent space as interchangeable
foregrounds, enabling very efficient and interpretable instance-level search.
Furthermore, pattern-level search is feasible to retrieve more controllable,
reasonable and diverse foregrounds. The proposed method outperforms the
previous state-of-the-art by 10.42% in absolute difference and 24.06% in
relative improvement evaluated by mean average precision (mAP). Extensive
experimental results also demonstrate its efficacy from various aspects. The
benchmark dataset and code will be released shortly.</description><subject>Computer Science - Computer Vision and Pattern Recognition</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz7tuwjAUgGEvHSroA3SqXyCpL_FtA9HSoiIxlD06xidg5CbIcW9vXwGd_u2XPkLuOasbqxR7hPwTv2rBmKmZs9rcktmqL5hPGQv4hHQ5ZNzn4bMPdOOPuCv0HSHvDnQ-0rd--E4Y9kif4lhiSlDi0E_JTQdpxLv_Tsh2-bxdvFbrzctqMV9XoI2pOHeWK-ORWY82NEI6rUAYNNpKZTonOiM8iC4oD06hV41iEsE3knEdvJyQh-v2QmhPOX5A_m3PlPZCkX8QU0Oe</recordid><startdate>20200719</startdate><enddate>20200719</enddate><creator>Li, Boren</creator><creator>Zhuang, Po-Yu</creator><creator>Gu, Jian</creator><creator>Li, Mingyang</creator><creator>Tan, Ping</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20200719</creationdate><title>Interpretable Foreground Object Search As Knowledge Distillation</title><author>Li, Boren ; Zhuang, Po-Yu ; Gu, Jian ; Li, Mingyang ; Tan, Ping</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a677-1198157be08be8d423965a27e768357f92f72ba2fd5ba95eb54503eab43016db3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Computer Science - Computer Vision and Pattern Recognition</topic><toplevel>online_resources</toplevel><creatorcontrib>Li, Boren</creatorcontrib><creatorcontrib>Zhuang, Po-Yu</creatorcontrib><creatorcontrib>Gu, Jian</creatorcontrib><creatorcontrib>Li, Mingyang</creatorcontrib><creatorcontrib>Tan, Ping</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Li, Boren</au><au>Zhuang, Po-Yu</au><au>Gu, Jian</au><au>Li, Mingyang</au><au>Tan, Ping</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Interpretable Foreground Object Search As Knowledge Distillation</atitle><date>2020-07-19</date><risdate>2020</risdate><abstract>This paper proposes a knowledge distillation method for foreground object
search (FoS). Given a background and a rectangle specifying the foreground
location and scale, FoS retrieves compatible foregrounds in a certain category
for later image composition. Foregrounds within the same category can be
grouped into a small number of patterns. Instances within each pattern are
compatible with any query input interchangeably. These instances are referred
to as interchangeable foregrounds. We first present a pipeline to build
a pattern-level FoS dataset containing labels of interchangeable foregrounds. We
then establish a benchmark dataset for further training and testing following
the pipeline. As for the proposed method, we first train a foreground encoder
to learn representations of interchangeable foregrounds. We then train a query
encoder to learn query-foreground compatibility following a knowledge
distillation framework. It aims to transfer knowledge from interchangeable
foregrounds to supervise representation learning of compatibility. The query
feature representation is projected to the same latent space as interchangeable
foregrounds, enabling very efficient and interpretable instance-level search.
Furthermore, pattern-level search is feasible to retrieve more controllable,
reasonable and diverse foregrounds. The proposed method outperforms the
previous state-of-the-art by 10.42% in absolute difference and 24.06% in
relative improvement evaluated by mean average precision (mAP). Extensive
experimental results also demonstrate its efficacy from various aspects. The
benchmark dataset and code will be released shortly.</abstract><doi>10.48550/arxiv.2007.09867</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2007.09867 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2007_09867 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition |
title | Interpretable Foreground Object Search As Knowledge Distillation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T21%3A16%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Interpretable%20Foreground%20Object%20Search%20As%20Knowledge%20Distillation&rft.au=Li,%20Boren&rft.date=2020-07-19&rft_id=info:doi/10.48550/arxiv.2007.09867&rft_dat=%3Carxiv_GOX%3E2007_09867%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
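
The abstract in this record describes projecting the query feature into the same latent space as the interchangeable foregrounds, which allows both instance-level and pattern-level search. The following is only a minimal sketch of what retrieval in such a shared space could look like; it is not the authors' implementation. The use of cosine similarity, the mean-score rule for ranking patterns, the function names, and the toy vectors and pattern labels (`standing`, `sitting`) are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def instance_search(query_vec, foregrounds, top_k=3):
    """Rank individual foregrounds by similarity to the query embedding.

    `foregrounds` is a list of (foreground_id, pattern_label, embedding)
    tuples; this layout is hypothetical.
    """
    scored = [(cosine(query_vec, vec), fid) for fid, _pattern, vec in foregrounds]
    scored.sort(reverse=True)
    return [fid for _score, fid in scored[:top_k]]


def pattern_search(query_vec, foregrounds):
    """Rank patterns by the mean similarity of their member foregrounds.

    Averaging is an assumed aggregation rule, chosen only for simplicity.
    """
    scores_by_pattern = defaultdict(list)
    for _fid, pattern, vec in foregrounds:
        scores_by_pattern[pattern].append(cosine(query_vec, vec))
    mean_score = {p: sum(s) / len(s) for p, s in scores_by_pattern.items()}
    return sorted(mean_score, key=mean_score.get, reverse=True)


# Toy 2-D embeddings standing in for encoder outputs (hypothetical values).
FOREGROUNDS = [
    ("fg_a", "standing", [1.0, 0.0]),
    ("fg_b", "standing", [0.9, 0.1]),
    ("fg_c", "sitting", [0.0, 1.0]),
]
QUERY = [1.0, 0.05]

if __name__ == "__main__":
    print(instance_search(QUERY, FOREGROUNDS, top_k=2))
    print(pattern_search(QUERY, FOREGROUNDS))
```

Because the foreground embeddings do not depend on the query, they could in principle be computed once and indexed, which is one plausible reading of the abstract's claim that instance-level search is efficient.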