Reply with Sticker: New Dataset and Model for Sticker Retrieval

Bibliographic details
Published in: arXiv.org 2024-07
Main authors: Liang, Bin; Wang, Bingbing; Bai, Zhixin; Lang, Qiwei; Sun, Mingwei; Hou, Kaiheng; Zhou, Lanjun; Xu, Ruifeng; Wong, Kam-Fai
Format: Article
Language: English
Online access: Full text
Description
Summary: Stickers are widely used in online chat on social media platforms, where they can express a user's intention, emotion, or attitude in a vivid, tactful, and intuitive way. Existing sticker retrieval research typically retrieves stickers based on the context and the current utterance delivered by the user; that is, the sticker serves as a supplement to the current utterance. In real-world scenarios, however, it is also important to use stickers to express what we want to say, rather than only to supplement our words. In this paper, we therefore create a new dataset for sticker retrieval in conversation, called StickerInt, in which stickers are used either to reply to the previous conversation or to supplement our words. (We believe that the release of this dataset will provide a more complete paradigm than existing work for research on sticker retrieval in open-domain online conversation.) Based on this dataset, we present a simple yet effective framework for sticker retrieval in conversation, called Int-RA, which learns intention and the cross-modal relationships between conversation context and stickers. Specifically, we first devise a knowledge-enhanced intention predictor to introduce intention information into the conversation representations. A relation-aware sticker selector then retrieves the response sticker via cross-modal relationships. Extensive experiments on the created dataset show that the proposed model achieves state-of-the-art performance in sticker retrieval. (The dataset and source code of this work are released at https://github.com/HITSZ-HLT/Int-RA.)
ISSN:2331-8422
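
Illustration: the summary describes retrieving a response sticker by relating a conversation representation to candidate stickers. The sketch below is not the Int-RA model and uses no code from the authors' repository; it only shows the generic retrieval step of ranking candidates by similarity to a context vector. The function names (`cosine`, `rank_stickers`), the sticker identifiers, and the three-dimensional toy vectors are all hypothetical, and plain cosine similarity is an assumed stand-in for the learned cross-modal scoring.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_stickers(
    context_vec: Sequence[float],
    sticker_vecs: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (sticker_id, score) pairs, highest similarity first."""
    scored = [(sid, cosine(context_vec, vec)) for sid, vec in sticker_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (hypothetical): in a real system these vectors would come from
# trained text and image encoders, not be written by hand.
context = [0.9, 0.1, 0.0]
stickers = {
    "thumbs_up": [1.0, 0.0, 0.0],
    "crying": [0.0, 1.0, 0.0],
    "shrug": [0.5, 0.5, 0.0],
}
ranking = rank_stickers(context, stickers, top_k=2)
```

With these toy vectors, `ranking` lists `thumbs_up` first and `shrug` second. A learned system would replace both the hand-written vectors and the fixed similarity function with trained components.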