Learning label-label correlations in Extreme Multi-label Classification via Label Features
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Extreme Multi-label Text Classification (XMC) involves learning a classifier
that can assign to an input the subset of most relevant labels from millions of
label choices. Recent works in this domain have increasingly focused on a
symmetric problem setting where both input instances and label features are
short-text in nature. Short-text XMC with label features has found numerous
applications in areas such as query-to-ad-phrase matching in search ads,
title-based product recommendation, and prediction of related searches. In this
paper, we propose Gandalf, a novel approach which makes use of a label
co-occurrence graph to leverage label features as additional data points to
supplement the training distribution. By exploiting the characteristics of the
short-text XMC problem, it leverages the label features to construct valid
training instances, and uses the label graph to generate the corresponding
soft-label targets, thereby effectively capturing the label-label correlations.
Surprisingly, models trained on these new training instances, although they
amount to less than half of the original dataset, can outperform models trained
on the original dataset, particularly on the PSP@k metric for tail labels. With
this insight, we train existing XMC algorithms on both the original and the new
training instances, leading to an average relative improvement of 5% for 6
state-of-the-art algorithms across 4 benchmark datasets consisting of up to
1.3M labels. Gandalf can be applied in a plug-and-play manner to various
methods and thus advances the state of the art in the domain without incurring
any additional computational overhead. |
---|---|
DOI: | 10.48550/arxiv.2405.04545 |
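
The summary describes turning each label's text into an extra training instance whose soft-label target comes from a label co-occurrence graph. The following is a minimal sketch of that idea, not the authors' implementation. The function names (`cooccurrence_counts`, `soft_targets`, `augment`), the normalization by each row's largest count, and the `threshold` parameter are all assumptions made for illustration; the record does not specify how the graph is normalized or pruned.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(label_sets):
    """Count how often each pair of labels is assigned to the same input."""
    counts = defaultdict(lambda: defaultdict(int))
    for labels in label_sets:
        unique = sorted(set(labels))
        for label in unique:
            counts[label]  # touch the row so isolated labels still get a target
        for a, b in combinations(unique, 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def soft_targets(label_sets, threshold=0.1):
    """Build one soft-label target per label from the co-occurrence graph.

    Assumption: weights are counts divided by the row maximum, and
    neighbours with weight below `threshold` are dropped.
    """
    targets = {}
    for label, row in cooccurrence_counts(label_sets).items():
        peak = max(row.values(), default=0)
        target = {label: 1.0}  # a label is always fully relevant to itself
        for other, count in row.items():
            weight = count / peak
            if weight >= threshold:
                target[other] = weight
        targets[label] = target
    return targets


def augment(train_pairs, label_texts, threshold=0.1):
    """Append (label text, soft target) pairs to the original training data."""
    targets = soft_targets([labels for _, labels in train_pairs], threshold)
    original = [(text, {l: 1.0 for l in labels}) for text, labels in train_pairs]
    extra = [(label_texts[l], t) for l, t in targets.items() if l in label_texts]
    return original + extra


# Toy data, invented for this example only.
train_pairs = [
    ("cheap running shoes", [0, 1]),
    ("trail running shoes", [0, 1, 2]),
    ("hiking boots", [2, 3]),
]
label_texts = {0: "running shoes", 1: "sports footwear",
               2: "outdoor gear", 3: "hiking boots"}

augmented = augment(train_pairs, label_texts)
```

In this toy run the three original instances keep their hard 0/1 targets, and four label-text instances are appended with graded targets. Any XMC model that accepts real-valued targets could then be trained on `augmented` without changes to its architecture, which is one way to read the "plug-and-play" claim in the summary.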