Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs

ECNLP 2022. Recently, semantic search has been successfully applied to e-commerce product search, and the learned semantic space(s) for query and product encoding are expected to generalize to unseen queries or products. Yet, whether generalization can conveniently emerge has not been thoroughly studi...

Detailed description

Bibliographic details
Main authors: Liu, Zheng; Zhang, Wei; Chen, Yan; Sun, Weiyi; Du, Tianchuan; Schroeder, Benjamin
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Liu, Zheng
Zhang, Wei
Chen, Yan
Sun, Weiyi
Du, Tianchuan
Schroeder, Benjamin
description ECNLP 2022. Recently, semantic search has been successfully applied to e-commerce product search, and the learned semantic space(s) for query and product encoding are expected to generalize to unseen queries or products. Yet, whether generalization can conveniently emerge has not been thoroughly studied in the domain thus far. In this paper, we examine several general-domain and domain-specific pre-trained RoBERTa variants and discover that general-domain fine-tuning does not help generalization, which aligns with the discovery of prior art. Proper domain-specific fine-tuning with clickstream data can lead to better model generalization, based on a bucketed analysis of a publicly available manually annotated query-product pair da
doi_str_mv 10.48550/arxiv.2204.05231
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2204_05231</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2204_05231</sourcerecordid><originalsourceid>FETCH-LOGICAL-a671-669b6a43a4a1a24af3bacfc95726c30cd7e47a19e78273cc031c9c5b129244363</originalsourceid><addsrcrecordid>eNo1j81KxDAUhbNxIaMP4Mq8QGv-mkyXUnQUCg7Yfbm5TcdgmkpaderTW0ddHc7h48BHyBVnudoWBbuBdPQfuRBM5awQkp8T24yfkLqJ7lx0CYL_AhscfXYDxNkj3aexe8d5HSDhC7ULbdxxrX7wAZKfl5Vw2ZzARx8PdIz_ZBU8vtJ6PEwX5KyHMLnLv9yQ5v6uqR6y-mn3WN3WGWjDM61Lq0FJUMBBKOilBeyxLIzQKBl2xikDvHRmK4xEZJJjiYXlohRKSS035Pr39iTZviU_QFraH9n2JCu_AYIaT5o</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs</title><source>arXiv.org</source><creator>Liu, Zheng ; Zhang, Wei ; Chen, Yan ; Sun, Weiyi ; Du, Tianchuan ; Schroeder, Benjamin</creator><creatorcontrib>Liu, Zheng ; Zhang, Wei ; Chen, Yan ; Sun, Weiyi ; Du, Tianchuan ; Schroeder, Benjamin</creatorcontrib><description>ECNLP 2022 Recently, semantic search has been successfully applied to e-commerce product search and the learned semantic space(s) for query and product encoding are expected to generalize to unseen queries or products. Yet, whether generalization can conveniently emerge has not been thoroughly studied in the domain thus far. In this paper, we examine several general-domain and domain-specific pre-trained Roberta variants and discover that general-domain fine-tuning does not help generalization, which aligns with the discovery of prior art. 
Proper domain-specific fine-tuning with clickstream data can lead to better model generalization, based on a bucketed analysis of a publicly available manual annotated query-product pair da</description><identifier>DOI: 10.48550/arxiv.2204.05231</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computation and Language ; Computer Science - Information Retrieval ; Computer Science - Learning</subject><creationdate>2022-04</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2204.05231$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2204.05231$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Liu, Zheng</creatorcontrib><creatorcontrib>Zhang, Wei</creatorcontrib><creatorcontrib>Chen, Yan</creatorcontrib><creatorcontrib>Sun, Weiyi</creatorcontrib><creatorcontrib>Du, Tianchuan</creatorcontrib><creatorcontrib>Schroeder, Benjamin</creatorcontrib><title>Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs</title><description>ECNLP 2022 Recently, semantic search has been successfully applied to e-commerce product search and the learned semantic space(s) for query and product encoding are expected to generalize to unseen queries or products. Yet, whether generalization can conveniently emerge has not been thoroughly studied in the domain thus far. 
In this paper, we examine several general-domain and domain-specific pre-trained Roberta variants and discover that general-domain fine-tuning does not help generalization, which aligns with the discovery of prior art. Proper domain-specific fine-tuning with clickstream data can lead to better model generalization, based on a bucketed analysis of a publicly available manual annotated query-product pair da</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computation and Language</subject><subject>Computer Science - Information Retrieval</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNo1j81KxDAUhbNxIaMP4Mq8QGv-mkyXUnQUCg7Yfbm5TcdgmkpaderTW0ddHc7h48BHyBVnudoWBbuBdPQfuRBM5awQkp8T24yfkLqJ7lx0CYL_AhscfXYDxNkj3aexe8d5HSDhC7ULbdxxrX7wAZKfl5Vw2ZzARx8PdIz_ZBU8vtJ6PEwX5KyHMLnLv9yQ5v6uqR6y-mn3WN3WGWjDM61Lq0FJUMBBKOilBeyxLIzQKBl2xikDvHRmK4xEZJJjiYXlohRKSS035Pr39iTZviU_QFraH9n2JCu_AYIaT5o</recordid><startdate>20220411</startdate><enddate>20220411</enddate><creator>Liu, Zheng</creator><creator>Zhang, Wei</creator><creator>Chen, Yan</creator><creator>Sun, Weiyi</creator><creator>Du, Tianchuan</creator><creator>Schroeder, Benjamin</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20220411</creationdate><title>Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs</title><author>Liu, Zheng ; Zhang, Wei ; Chen, Yan ; Sun, Weiyi ; Du, Tianchuan ; Schroeder, Benjamin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a671-669b6a43a4a1a24af3bacfc95726c30cd7e47a19e78273cc031c9c5b129244363</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Computer Science - Artificial 
Intelligence</topic><topic>Computer Science - Computation and Language</topic><topic>Computer Science - Information Retrieval</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Liu, Zheng</creatorcontrib><creatorcontrib>Zhang, Wei</creatorcontrib><creatorcontrib>Chen, Yan</creatorcontrib><creatorcontrib>Sun, Weiyi</creatorcontrib><creatorcontrib>Du, Tianchuan</creatorcontrib><creatorcontrib>Schroeder, Benjamin</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Liu, Zheng</au><au>Zhang, Wei</au><au>Chen, Yan</au><au>Sun, Weiyi</au><au>Du, Tianchuan</au><au>Schroeder, Benjamin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs</atitle><date>2022-04-11</date><risdate>2022</risdate><abstract>ECNLP 2022 Recently, semantic search has been successfully applied to e-commerce product search and the learned semantic space(s) for query and product encoding are expected to generalize to unseen queries or products. Yet, whether generalization can conveniently emerge has not been thoroughly studied in the domain thus far. In this paper, we examine several general-domain and domain-specific pre-trained Roberta variants and discover that general-domain fine-tuning does not help generalization, which aligns with the discovery of prior art. Proper domain-specific fine-tuning with clickstream data can lead to better model generalization, based on a bucketed analysis of a publicly available manual annotated query-product pair da</abstract><doi>10.48550/arxiv.2204.05231</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2204.05231
ispartof
issn
language eng
recordid cdi_arxiv_primary_2204_05231
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computation and Language
Computer Science - Information Retrieval
Computer Science - Learning
title Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T17%3A00%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Towards%20Generalizable%20Semantic%20Product%20Search%20by%20Text%20Similarity%20Pre-training%20on%20Search%20Click%20Logs&rft.au=Liu,%20Zheng&rft.date=2022-04-11&rft_id=info:doi/10.48550/arxiv.2204.05231&rft_dat=%3Carxiv_GOX%3E2204_05231%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true