A Semi-supervised Multi-channel Graph Convolutional Network for Query Classification in E-commerce
Main Authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Query intent classification is an essential module for customers to find
desired products on the e-commerce application quickly. Most existing query
intent classification methods rely on the users' click behavior as a supervised
signal to construct training samples. However, these methods based entirely on
posterior labels may lead to serious category imbalance problems because of the
Matthew effect in click samples. Compared with popular categories, it is
difficult for products under long-tail categories to obtain traffic and user
clicks, which makes the models unable to detect users' intent for products
under long-tail categories. This in turn aggravates the problem that long-tail
categories cannot obtain traffic, forming a vicious circle. In addition, due to
the randomness of user clicks, the posterior labels are unstable for
queries with similar semantics, which makes the model very sensitive to the
input, leading to an unstable and incomplete recall of categories.
In this paper, we propose a novel Semi-supervised Multi-channel Graph
Convolutional Network (SMGCN) to address the above problems from the
perspective of label association and semi-supervised learning. SMGCN extends
category information and enhances the posterior label by utilizing the
similarity score between the query and categories. Furthermore, it leverages
the co-occurrence and semantic similarity graph of categories to strengthen the
relations among labels and weaken the influence of posterior label instability.
We conduct extensive offline and online A/B experiments, and the experimental
results show that SMGCN significantly outperforms strong baselines,
demonstrating its effectiveness and practicality. |
---|---|
DOI: | 10.48550/arxiv.2408.01928 |