Strongly-Typed Recurrent Neural Networks
Main authors: Balduzzi, David; Ghifary, Muhammad
Format: Article
Language: English
Online access: Order full text
Abstract: Recurrent neural networks are increasingly popular models for sequential learning. Unfortunately, although the most effective RNN architectures are perhaps excessively complicated, extensive searches have not found simpler alternatives. This paper imports ideas from physics and functional programming into RNN design to provide guiding principles. From physics, we introduce type constraints, analogous to the constraints that forbid adding meters to seconds. From functional programming, we require that strongly-typed architectures factorize into stateless learnware and state-dependent firmware, reducing the impact of side-effects. The features learned by strongly-typed nets have a simple semantic interpretation via dynamic average-pooling on one-dimensional convolutions. We also show that strongly-typed gradients are better behaved than in classical architectures, and characterize the representational power of strongly-typed nets. Finally, experiments show that, despite being more constrained, strongly-typed architectures achieve lower training and comparable generalization error to classical architectures.
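
To make the abstract's central construction concrete, here is a minimal NumPy sketch of a strongly-typed vanilla RNN consistent with the description above: every weight matrix acts on the input alone (stateless "learnware"), and the hidden state enters only through a parameter-free convex combination (state-dependent "firmware"), so each feature is a dynamic average-pool over one-dimensional convolutions of the input. The function name, shapes, and random data are illustrative assumptions, not the authors' code.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def strongly_typed_rnn(xs, W, V, b):
    """Illustrative sketch of a strongly-typed vanilla RNN (not the authors' code).

    xs: inputs of shape (T, d_in); W, V: weights of shape (d_hid, d_in);
    b: bias of shape (d_hid,). Returns hidden states of shape (T, d_hid).
    """
    h = np.zeros(W.shape[0])
    states = []
    for x in xs:
        z = W @ x                   # candidate feature: a function of the input only
        f = sigmoid(V @ x + b)      # gate: also input-only, so all weights are stateless "learnware"
        h = f * h + (1.0 - f) * z   # parameter-free state update ("firmware"): a dynamic
                                    # average-pool of the current and past features
        states.append(h)
    return np.stack(states)

# Tiny usage example with random data (shapes are illustrative).
rng = np.random.default_rng(0)
T, d_in, d_hid = 5, 3, 4
xs = rng.normal(size=(T, d_in))
W, V = rng.normal(size=(d_hid, d_in)), rng.normal(size=(d_hid, d_in))
b = np.zeros(d_hid)
print(strongly_typed_rnn(xs, W, V, b).shape)  # (5, 4)
```

Note the contrast with a classical RNN: no weight matrix ever multiplies the hidden state, so the state update respects the abstract's type constraints and keeps side-effects confined to the elementwise pooling step.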
DOI: 10.48550/arxiv.1602.02218