Multi-output Headed Ensembles for Product Item Classification
Main authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Abstract: | In this paper, we revisit the problem of product item classification for
large-scale e-commerce catalogs. The taxonomy of e-commerce catalogs consists
of thousands of genres to which are assigned items that are uploaded by
merchants on a continuous basis. The genre assignments by merchants are often
wrong but treated as ground truth labels in automatically generated training
sets, thus creating a feedback loop that leads to poorer model quality over
time. This problem of taxonomy classification becomes highly pronounced due to
the unavailability of sizable curated training sets.
In such a scenario, it is common to combine multiple classifiers to combat
poor generalization performance from a single classifier. We propose an
extensible deep-learning-based classification model framework that benefits
from the simplicity and robustness of averaging ensembles and fusion-based
classifiers. We are also able to use metadata features and low-level feature
engineering to boost classification performance. We show these improvements
against robust industry-standard baseline models that employ hyperparameter
optimization.
Additionally, due to continuous insertion, deletion and updates to real-world
high-volume e-commerce catalogs, assessing model performance for deployment
using A/B testing and/or manual annotation becomes a bottleneck. To this end,
we also propose a novel way to evaluate model performance using user sessions
that provides better insights in addition to traditional measures of precision
and recall. |
---|---|
DOI: | 10.48550/arxiv.2307.15858 |