Embedded Topic Models Enhanced by Wikification
Saved in:
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Summary: Topic modeling analyzes a collection of documents to learn meaningful patterns of words. However, previous topic models consider only the spelling of words and do not take the homography of words into consideration. In this study, we incorporate Wikipedia knowledge into a neural topic model to make it aware of named entities. We evaluate our method on two datasets: 1) news articles from the New York Times and 2) the AIDA-CoNLL dataset. Our experiments show that our method improves the generalizability of neural topic models. Moreover, we analyze frequent terms in each topic and the temporal dependencies between topics to demonstrate that our entity-aware topic models can capture the time-series development of topics well.
DOI: 10.48550/arxiv.2410.02441
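
The summary describes making a neural topic model entity-aware by incorporating Wikipedia knowledge. Below is a minimal sketch of that preprocessing idea, assuming a dictionary-based wikification step and a simple bag-of-words pipeline; the mention dictionary, token format, and helper names are hypothetical stand-ins, not the paper's implementation. The point is that homographic mentions such as "New York" (the city) and "New York Times" (the newspaper) become distinct vocabulary items before the topic model is trained.

# Sketch: wikify documents, then build the bag-of-words input of a topic model.
import re
from collections import Counter

# Hypothetical mention -> Wikipedia title mapping; a real wikifier would
# produce these links from context, disambiguating homographic mentions.
ENTITY_LINKS = {
    "new york times": "ENTITY/The_New_York_Times",
    "new york": "ENTITY/New_York_City",
}

def wikify(text: str) -> list[str]:
    """Replace known mentions with single entity tokens, longest match first."""
    text = text.lower()
    for mention in sorted(ENTITY_LINKS, key=len, reverse=True):
        text = text.replace(mention, ENTITY_LINKS[mention])
    # Entity tokens survive tokenization as single vocabulary items.
    return re.findall(r"ENTITY/\w+|[a-z]+", text)

def bag_of_words(docs: list[str]) -> tuple[list[str], list[Counter]]:
    """Build the vocabulary and per-document counts fed to a topic model."""
    tokenized = [wikify(d) for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    return vocab, [Counter(doc) for doc in tokenized]

if __name__ == "__main__":
    docs = ["The New York Times reported on housing prices in New York."]
    vocab, counts = bag_of_words(docs)
    print(vocab)
    print(counts[0])

Running the example shows one token for the newspaper and a different token for the city, so the downstream topic model sees two separate vocabulary entries instead of a single ambiguous surface form.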