Block-SCL: Blocking Matters for Supervised Contrastive Learning in Product Matching
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Product matching is a fundamental step for the global understanding of
consumer behavior in e-commerce. In practice, product matching refers to the
task of deciding whether two product offers from different data sources (e.g.
retailers) represent the same product. Standard pipelines use a preliminary stage
called blocking, where for a given product offer a set of potential matching
candidates is retrieved based on similar characteristics (e.g. same brand,
category, flavor). From these similar product candidates, those that are
not a match can be considered hard negatives. We present Block-SCL, a strategy
that uses the blocking output to make the most of Supervised Contrastive
Learning (SCL). Concretely, Block-SCL builds enriched batches using the
hard-negative samples obtained in the blocking stage. These batches provide a
strong training signal, leading the model to learn more meaningful sentence
embeddings for product matching. Experimental results on several public
datasets demonstrate that Block-SCL achieves state-of-the-art results despite
using only short product titles as input, no data augmentation, and a lighter
transformer backbone than competing methods. |
---|---|
DOI: | 10.48550/arxiv.2207.02008 |
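
The batch construction described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`build_block_batch`, `supcon_loss`), the `product_of` mapping, the toy 2-D embeddings, and the temperature value are all assumptions made for illustration, and the loss is a generic supervised contrastive loss written in plain Python rather than whatever formulation the paper uses. The idea shown is only that a batch is formed from one offer plus its blocking candidates, so that non-matching candidates serve as hard negatives.

```python
import math


def build_block_batch(anchor, candidates, product_of):
    """Form one batch from an offer and its blocking candidates (hypothetical helper).

    Labels are product ids, so candidates whose id differs from the
    anchor's id end up in the batch as hard negatives.
    """
    offers = [anchor] + [c for c in candidates if c != anchor]
    labels = [product_of[o] for o in offers]
    return offers, labels


def _normalise(v):
    """L2-normalise a vector; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.07):
    """Generic supervised contrastive loss over one batch (assumed formulation).

    For each sample that has at least one positive in the batch, average
    the negative log-softmax of its positives against all other samples.
    """
    z = [_normalise(v) for v in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # samples without a positive contribute nothing
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # log-sum-exp with the maximum subtracted for numerical stability
        top = max(sims.values())
        log_denom = top + math.log(sum(math.exp(s - top) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy data: offers a1 and a2 are the same product; b1 and c1 are blocking
# candidates that do not match, i.e. hard negatives for a1.
product_of = {"a1": "P1", "a2": "P1", "b1": "P2", "c1": "P3"}
offers, labels = build_block_batch("a1", ["a2", "b1", "c1"], product_of)

separated = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
confused = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
print(supcon_loss(separated, labels), supcon_loss(confused, labels))
```

In this toy setup, embeddings that place the hard negatives close to the anchor (`confused`) yield a larger loss than embeddings that separate them (`separated`), which is the training signal the summary attributes to blocking-based batches.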