Sawtooth Factorial Topic Embeddings Guided Gamma Belief Network
Hierarchical topic models such as the gamma belief network (GBN) have delivered promising results in mining multi-layer document representations and discovering interpretable topic taxonomies. However, they often assume in the prior that the topics at each layer are independently drawn from the Diri...
Gespeichert in:
Hauptverfasser: | , , , , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Hierarchical topic models such as the gamma belief network (GBN) have
delivered promising results in mining multi-layer document representations and
discovering interpretable topic taxonomies. However, they often assume in the
prior that the topics at each layer are independently drawn from the Dirichlet
distribution, ignoring the dependencies between the topics both at the same
layer and across different layers. To relax this assumption, we propose
sawtooth factorial topic embedding guided GBN, a deep generative model of
documents that captures the dependencies and semantic similarities between the
topics in the embedding space. Specifically, both the words and topics are
represented as embedding vectors of the same dimension. The topic matrix at a
layer is factorized into the product of a factor loading matrix and a topic
embedding matrix, the transpose of which is set as the factor loading matrix of
the layer above. Repeating this particular type of factorization, which shares
components between adjacent layers, leads to a structure referred to as
sawtooth factorization. An auto-encoding variational inference network is
constructed to optimize the model parameter via stochastic gradient descent.
Experiments on big corpora show that our models outperform other neural topic
models on extracting deeper interpretable topics and deriving better document
representations. |
---|---|
DOI: | 10.48550/arxiv.2107.02757 |