ClickPrompt: CTR Models are Strong Prompt Generators for Adapting Language Models to CTR Prediction
Click-through rate (CTR) prediction has become increasingly indispensable for various Internet applications. Traditional CTR models convert the multi-field categorical data into ID features via one-hot encoding, and extract the collaborative signals among features. Such a paradigm suffers from the p...
Gespeichert in:
Hauptverfasser: | , , , , , , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Click-through rate (CTR) prediction has become increasingly indispensable for
various Internet applications. Traditional CTR models convert the multi-field
categorical data into ID features via one-hot encoding, and extract the
collaborative signals among features. Such a paradigm suffers from the problem
of semantic information loss. Another line of research explores the potential
of pretrained language models (PLMs) for CTR prediction by converting input
data into textual sentences through hard prompt templates. Although semantic
signals are preserved, they generally fail to capture the collaborative
information (e.g., feature interactions, pure ID features), not to mention the
unacceptable inference overhead brought by the huge model size. In this paper,
we aim to model both the semantic knowledge and collaborative knowledge for
accurate CTR estimation, and meanwhile address the inference inefficiency
issue. To benefit from both worlds and close their gaps, we propose a novel
model-agnostic framework (i.e., ClickPrompt), where we incorporate CTR models
to generate interaction-aware soft prompts for PLMs. We design a
prompt-augmented masked language modeling (PA-MLM) pretraining task, where PLM
has to recover the masked tokens based on the language context, as well as the
soft prompts generated by CTR model. The collaborative and semantic knowledge
from ID and textual features would be explicitly aligned and interacted via the
prompt interface. Then, we can either tune the CTR model with PLM for superior
performance, or solely tune the CTR model without PLM for inference efficiency.
Experiments on four real-world datasets validate the effectiveness of
ClickPrompt compared with existing baselines. |
---|---|
DOI: | 10.48550/arxiv.2310.09234 |