Regularization, robustness and sparsity of probabilistic topic models

Bibliographic Details
Published in: Kompʹûternye issledovaniâ i modelirovanie (Online) 2012-12, Vol. 4 (4), p. 693-706
Main Authors: Vorontsov, Konstantin Vyacheslavovich; Potapenko, Anna Alexandrovna
Format: Article
Language: English; Russian
Description
Summary: We propose a generalized probabilistic topic model of text corpora which can incorporate heuristics of Bayesian regularization, sampling, frequent parameter updates, and robustness in any combination. Well-known models such as PLSA, LDA, CVB0, SWB, and many others can be considered special cases of the proposed broad family of models. We propose the robust PLSA model and show that it is more sparse and performs better than regularized models such as LDA.
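
The record itself contains no formulas or code. For orientation only, below is a minimal NumPy sketch of the plain PLSA EM iteration on which the regularized and robust variants mentioned in the abstract build; the function name, parameters, and the toy corpus are illustrative assumptions, not taken from the paper.

import numpy as np

def plsa_em(ndw, num_topics, iterations=50, seed=0):
    # ndw: document-word count matrix of shape (D, W).
    # Returns phi (W x T) = p(word | topic) and theta (T x D) = p(topic | doc).
    rng = np.random.default_rng(seed)
    D, W = ndw.shape
    T = num_topics
    phi = rng.random((W, T))
    phi /= phi.sum(axis=0, keepdims=True)
    theta = rng.random((T, D))
    theta /= theta.sum(axis=0, keepdims=True)

    for _ in range(iterations):
        phi_acc = np.zeros_like(phi)
        theta_acc = np.zeros_like(theta)
        for d in range(D):
            # E-step: posterior p(topic | doc, word) for all words of document d
            joint = phi * theta[:, d]                       # shape (W, T)
            norm = np.maximum(joint.sum(axis=1, keepdims=True), 1e-30)
            ptdw = joint / norm
            counts = ndw[d][:, None] * ptdw                 # expected topic counts
            # M-step accumulation; a regularizer would adjust these counts before
            # normalization (e.g. adding Dirichlet pseudo-counts gives LDA-like smoothing)
            phi_acc += counts
            theta_acc[:, d] = counts.sum(axis=0)
        phi = phi_acc / np.maximum(phi_acc.sum(axis=0, keepdims=True), 1e-30)
        theta = theta_acc / np.maximum(theta_acc.sum(axis=0, keepdims=True), 1e-30)
    return phi, theta

# Example usage on a hypothetical 3-document, 5-word corpus:
# counts = np.array([[2, 1, 0, 0, 0], [0, 0, 3, 1, 0], [1, 0, 0, 2, 2]])
# phi, theta = plsa_em(counts, num_topics=2)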
ISSN: 2076-7633; 2077-6853
DOI: 10.20537/2076-7633-2012-4-4-693-706