Syntax-Aware Language Modeling with Recurrent Neural Networks
Main Authors: , ,
Format: Article
Language: eng
Subjects:
Online Access: Order full text
Abstract: Neural language models (LMs) are typically trained using only lexical features, such as the surface forms of words. In this paper, we argue that this deprives the LM of crucial syntactic signals that can be detected at high confidence using existing parsers. We present a simple but highly effective approach for training neural LMs using both lexical and syntactic information, and a novel approach for applying such LMs to unparsed text using sequential Monte Carlo sampling. In experiments on a range of corpora and corpus sizes, we show that our approach consistently outperforms standard lexical LMs in character-level language modeling, whereas for word-level modeling the models are on a par with standard LMs. These results indicate potential for expanding LMs beyond lexical surface features to higher-level NLP features for character-level models.
DOI: 10.48550/arxiv.1803.03665
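
The abstract mentions applying a jointly trained lexical-plus-syntactic LM to unparsed text via sequential Monte Carlo sampling. As a rough illustration only, and not the authors' implementation, the sketch below estimates the marginal probability of a word sequence by particle filtering over latent syntactic tags; the tag set `TAGS`, the toy `joint_prob` table, and the function `smc_log_marginal` are all hypothetical placeholders standing in for a trained joint model.

```python
# Minimal sketch (not the paper's code): sequential Monte Carlo over latent
# syntactic tags to score unparsed text with a joint LM p(word, tag | history).
import math
import random

TAGS = ["NOUN", "VERB", "DET"]  # hypothetical tag inventory


def joint_prob(word, tag, history):
    """Toy stand-in for a trained joint LM p(word, tag | history).

    A fixed table is used here for brevity; the history is ignored.
    """
    table = {
        ("the", "DET"): 0.20,
        ("dog", "NOUN"): 0.15,
        ("barks", "VERB"): 0.10,
    }
    return table.get((word, tag), 0.01)


def smc_log_marginal(words, num_particles=100, seed=0):
    """Estimate log p(words), marginalizing over tag sequences with a
    bootstrap particle filter (uniform tag proposal, multinomial resampling)."""
    rng = random.Random(seed)
    particles = [[] for _ in range(num_particles)]  # each particle: (word, tag) history
    log_marginal = 0.0
    for word in words:
        weights = []
        for i, hist in enumerate(particles):
            # Propose a tag uniformly; importance weight = joint prob / proposal prob.
            tag = rng.choice(TAGS)
            weights.append(joint_prob(word, tag, hist) * len(TAGS))
            particles[i] = hist + [(word, tag)]
        # The running product of average incremental weights estimates p(words).
        log_marginal += math.log(sum(weights) / num_particles)
        # Resample particles in proportion to their weights.
        particles = [
            particles[rng.choices(range(num_particles), weights=weights)[0]][:]
            for _ in range(num_particles)
        ]
    return log_marginal


if __name__ == "__main__":
    print(smc_log_marginal(["the", "dog", "barks"]))
```

In this sketch the particle filter plays the role the abstract attributes to sequential Monte Carlo: when no parse is available at test time, the syntactic annotation is treated as a latent variable and summed out approximately, so the joint LM can still assign a probability to plain, unparsed text.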