Applying Heuristic and Machine Learning Strategies to Product Resolution

In order to analyze product data obtained from different web shops a process is needed to determine which product descriptions refer to the same product (product resolution). Based on string similarity metrics and existing product resolution approaches a new approach is presented with the following...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Strauß, Oliver, Almheidat, Ahmad, Kett, Holger
Format: Tagungsbericht
Sprache:eng
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In order to analyze product data obtained from different web shops a process is needed to determine which product descriptions refer to the same product (product resolution). Based on string similarity metrics and existing product resolution approaches a new approach is presented with the following components: a) extraction of information from the unstructured product title extracted from the e-shops, b) inclusion of additional information in the matching process, c) a method to compute a product similarity metric from the available data, d) optimization and adaption of model parameters to the characteristics of the underlying data via a genetic algorithm and e) a framework to automatically evaluate the matching method on the basis of realistic test data. The approach achieved a precision of 0.946 and a recall of 0.673.
DOI:10.5220/0008069402420249