Unsupervised Qualitative Scoring for Binary Item Features

Bibliographic Details
Published in: Data Science and Engineering, 2020-09, Vol. 5(3), pp. 317-330
Authors: Ichikawa, Koji; Tamano, Hiroshi
Format: Article
Language: English
Online access: Full text
Abstract: Binary features, such as categories, keywords, or tags, are widely used to describe product properties. However, these features are incomplete in that they do not capture several aspects of numerical information. The qualitative score of tags is widely used to describe which product is better in terms of the given property. For example, on a restaurant navigation site, properties such as mood, dishes, and location are given as numerical values representing the goodness of each aspect. In this paper, we propose a novel approach to estimating the qualitative score from the binary features of products. Based on the natural assumption that an item with a better property is more popular among users who prefer that property (in short, “experts know best”), we introduce both discriminative and generative models with which user preferences and item qualitative scores are inferred from user–item interactions. We constrain the space of the item qualitative scores by the item binary features so that the score of each item and tag can be nonzero only when the item has the corresponding tag. This approach contributes to resolving the following difficulties: (1) no supervised data for the score estimation, (2) implicit user purpose, and (3) irrelevant tag contamination. We evaluate our models on two artificial datasets and two real-world datasets of movie and book ratings. In the experiments, we evaluate the performance of our models under sparse-transaction and noisy-tag settings using the two artificial datasets. We also evaluate our models’ resolution of irrelevant tags on the real-world dataset of movie ratings and observe that our models outperform a baseline model. Finally, tag rankings obtained from the real-world datasets are compared with those of a baseline model.
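To illustrate the constraint described in the abstract, the short Python sketch below masks an item-by-tag score matrix with the binary tag indicator so that scores can be nonzero only where an item actually carries the tag. The matrix names, the squared-error objective, and the gradient step are illustrative assumptions for this sketch, not the paper's actual discriminative or generative model.

    # Minimal sketch of the tag-masking constraint: an item's qualitative score
    # for a tag may be nonzero only if the item has that tag. Names, the
    # squared-error objective, and the update rule are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    n_items, n_tags, n_users = 50, 8, 200

    # Binary item features: T[i, k] = 1 if item i has tag k.
    T = (rng.random((n_items, n_tags)) < 0.3).astype(float)

    # Latent variables: user preferences over tags and item scores per tag.
    P = rng.normal(size=(n_users, n_tags)) * 0.1   # user preference weights
    S = rng.normal(size=(n_items, n_tags)) * 0.1   # item qualitative scores

    # Observed user-item interactions (here: a dense toy matrix of ratings).
    R = rng.random((n_users, n_items))

    lr = 0.01
    for _ in range(200):
        S *= T                 # constraint: scores only where the tag is present
        pred = P @ S.T         # predicted affinity of each user for each item
        err = pred - R
        P -= lr * (err @ S) / n_items
        S -= lr * ((err.T @ P) * T) / n_users   # masked gradient keeps the constraint

    print("nonzero scores outside the tag support:", np.count_nonzero(S * (1 - T)))

After training, each column of S gives a ranking of items for one tag, which mirrors how the paper's tag rankings are read off the inferred item qualitative scores.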
ISSN: 2364-1185, 2364-1541
DOI: 10.1007/s41019-020-00129-x