Adversarial Attacks on Online Learning to Rank with Click Feedback
Online learning to rank (OLTR) is a sequential decision-making problem where a learning agent selects an ordered list of items and receives feedback through user clicks. Although potential attacks against OLTR algorithms may cause serious losses in real-world applications, little is known about adve...
Gespeichert in:
Hauptverfasser: | , , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Online learning to rank (OLTR) is a sequential decision-making problem where
a learning agent selects an ordered list of items and receives feedback through
user clicks. Although potential attacks against OLTR algorithms may cause
serious losses in real-world applications, little is known about adversarial
attacks on OLTR. This paper studies attack strategies against multiple variants
of OLTR. Our first result provides an attack strategy against the UCB algorithm
on classical stochastic bandits with binary feedback, which solves the key
issues caused by bounded and discrete feedback that previous works can not
handle. Building on this result, we design attack algorithms against UCB-based
OLTR algorithms in position-based and cascade models. Finally, we propose a
general attack strategy against any algorithm under the general click model.
Each attack algorithm manipulates the learning agent into choosing the target
attack item $T-o(T)$ times, incurring a cumulative cost of $o(T)$. Experiments
on synthetic and real data further validate the effectiveness of our proposed
attack algorithms. |
---|---|
DOI: | 10.48550/arxiv.2305.17071 |