Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs
ECNLP 2022. Recently, semantic search has been successfully applied to e-commerce product search, and the learned semantic space(s) for query and product encoding are expected to generalize to unseen queries or products. Yet, whether generalization can conveniently emerge has not been thoroughly studi...
Main authors: | Liu, Zheng; Zhang, Wei; Chen, Yan; Sun, Weiyi; Du, Tianchuan; Schroeder, Benjamin |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Liu, Zheng; Zhang, Wei; Chen, Yan; Sun, Weiyi; Du, Tianchuan; Schroeder, Benjamin |
description | ECNLP 2022. Recently, semantic search has been successfully applied to e-commerce product
search, and the learned semantic space(s) for query and product encoding are
expected to generalize to unseen queries or products. Yet whether
generalization can conveniently emerge has not been thoroughly studied in the
domain thus far. In this paper, we examine several general-domain and
domain-specific pre-trained RoBERTa variants and discover that general-domain
fine-tuning does not help generalization, which aligns with the findings of
prior work. Proper domain-specific fine-tuning with clickstream data can lead to
better model generalization, based on a bucketed analysis of a publicly
available manually annotated query-product pair da |
doi_str_mv | 10.48550/arxiv.2204.05231 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2204_05231</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2204_05231</sourcerecordid><originalsourceid>FETCH-LOGICAL-a671-669b6a43a4a1a24af3bacfc95726c30cd7e47a19e78273cc031c9c5b129244363</originalsourceid><addsrcrecordid>eNo1j81KxDAUhbNxIaMP4Mq8QGv-mkyXUnQUCg7Yfbm5TcdgmkpaderTW0ddHc7h48BHyBVnudoWBbuBdPQfuRBM5awQkp8T24yfkLqJ7lx0CYL_AhscfXYDxNkj3aexe8d5HSDhC7ULbdxxrX7wAZKfl5Vw2ZzARx8PdIz_ZBU8vtJ6PEwX5KyHMLnLv9yQ5v6uqR6y-mn3WN3WGWjDM61Lq0FJUMBBKOilBeyxLIzQKBl2xikDvHRmK4xEZJJjiYXlohRKSS035Pr39iTZviU_QFraH9n2JCu_AYIaT5o</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs</title><source>arXiv.org</source><creator>Liu, Zheng ; Zhang, Wei ; Chen, Yan ; Sun, Weiyi ; Du, Tianchuan ; Schroeder, Benjamin</creator><creatorcontrib>Liu, Zheng ; Zhang, Wei ; Chen, Yan ; Sun, Weiyi ; Du, Tianchuan ; Schroeder, Benjamin</creatorcontrib><description>ECNLP 2022 Recently, semantic search has been successfully applied to e-commerce product
search and the learned semantic space(s) for query and product encoding are
expected to generalize to unseen queries or products. Yet, whether
generalization can conveniently emerge has not been thoroughly studied in the
domain thus far. In this paper, we examine several general-domain and
domain-specific pre-trained Roberta variants and discover that general-domain
fine-tuning does not help generalization, which aligns with the discovery of
prior art. Proper domain-specific fine-tuning with clickstream data can lead to
better model generalization, based on a bucketed analysis of a publicly
available manual annotated query-product pair da</description><identifier>DOI: 10.48550/arxiv.2204.05231</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computation and Language ; Computer Science - Information Retrieval ; Computer Science - Learning</subject><creationdate>2022-04</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2204.05231$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2204.05231$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Liu, Zheng</creatorcontrib><creatorcontrib>Zhang, Wei</creatorcontrib><creatorcontrib>Chen, Yan</creatorcontrib><creatorcontrib>Sun, Weiyi</creatorcontrib><creatorcontrib>Du, Tianchuan</creatorcontrib><creatorcontrib>Schroeder, Benjamin</creatorcontrib><title>Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs</title><description>ECNLP 2022 Recently, semantic search has been successfully applied to e-commerce product
search and the learned semantic space(s) for query and product encoding are
expected to generalize to unseen queries or products. Yet, whether
generalization can conveniently emerge has not been thoroughly studied in the
domain thus far. In this paper, we examine several general-domain and
domain-specific pre-trained Roberta variants and discover that general-domain
fine-tuning does not help generalization, which aligns with the discovery of
prior art. Proper domain-specific fine-tuning with clickstream data can lead to
better model generalization, based on a bucketed analysis of a publicly
available manual annotated query-product pair da</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computation and Language</subject><subject>Computer Science - Information Retrieval</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNo1j81KxDAUhbNxIaMP4Mq8QGv-mkyXUnQUCg7Yfbm5TcdgmkpaderTW0ddHc7h48BHyBVnudoWBbuBdPQfuRBM5awQkp8T24yfkLqJ7lx0CYL_AhscfXYDxNkj3aexe8d5HSDhC7ULbdxxrX7wAZKfl5Vw2ZzARx8PdIz_ZBU8vtJ6PEwX5KyHMLnLv9yQ5v6uqR6y-mn3WN3WGWjDM61Lq0FJUMBBKOilBeyxLIzQKBl2xikDvHRmK4xEZJJjiYXlohRKSS035Pr39iTZviU_QFraH9n2JCu_AYIaT5o</recordid><startdate>20220411</startdate><enddate>20220411</enddate><creator>Liu, Zheng</creator><creator>Zhang, Wei</creator><creator>Chen, Yan</creator><creator>Sun, Weiyi</creator><creator>Du, Tianchuan</creator><creator>Schroeder, Benjamin</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20220411</creationdate><title>Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs</title><author>Liu, Zheng ; Zhang, Wei ; Chen, Yan ; Sun, Weiyi ; Du, Tianchuan ; Schroeder, Benjamin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a671-669b6a43a4a1a24af3bacfc95726c30cd7e47a19e78273cc031c9c5b129244363</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Computation and Language</topic><topic>Computer Science - Information Retrieval</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Liu, Zheng</creatorcontrib><creatorcontrib>Zhang, Wei</creatorcontrib><creatorcontrib>Chen, Yan</creatorcontrib><creatorcontrib>Sun, 
Weiyi</creatorcontrib><creatorcontrib>Du, Tianchuan</creatorcontrib><creatorcontrib>Schroeder, Benjamin</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Liu, Zheng</au><au>Zhang, Wei</au><au>Chen, Yan</au><au>Sun, Weiyi</au><au>Du, Tianchuan</au><au>Schroeder, Benjamin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs</atitle><date>2022-04-11</date><risdate>2022</risdate><abstract>ECNLP 2022 Recently, semantic search has been successfully applied to e-commerce product
search and the learned semantic space(s) for query and product encoding are
expected to generalize to unseen queries or products. Yet, whether
generalization can conveniently emerge has not been thoroughly studied in the
domain thus far. In this paper, we examine several general-domain and
domain-specific pre-trained Roberta variants and discover that general-domain
fine-tuning does not help generalization, which aligns with the discovery of
prior art. Proper domain-specific fine-tuning with clickstream data can lead to
better model generalization, based on a bucketed analysis of a publicly
available manual annotated query-product pair da</abstract><doi>10.48550/arxiv.2204.05231</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2204.05231 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2204_05231 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Information Retrieval; Computer Science - Learning |
title | Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T17%3A00%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Towards%20Generalizable%20Semantic%20Product%20Search%20by%20Text%20Similarity%20Pre-training%20on%20Search%20Click%20Logs&rft.au=Liu,%20Zheng&rft.date=2022-04-11&rft_id=info:doi/10.48550/arxiv.2204.05231&rft_dat=%3Carxiv_GOX%3E2204_05231%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |