A Monte Carlo EM Approach for Partially Observable Diffusion Processes: Theory and Applications to Neural Networks

Bibliographic details
Published in:	Neural Computation 2002-07, Vol.14 (7), p.1507-1544
Authors:	Movellan, Javier R., Mineiro, Paul, Williams, R. J.
Format:	Article
Language:	English
Subjects:	Algorithms; Applied sciences; Artificial Intelligence; Biological and medical sciences; Computer science; control theory; systems; Diffusion; Exact sciences and technology; Fundamental and applied biological sciences. Psychology; General aspects; Humans; Lattice theory and statistics (Ising, Potts, etc.); Learning and adaptive systems; Lipreading; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Monte Carlo Method; Neural Networks (Computer); Pattern Recognition, Visual; Physics; Speech; Statistical physics, thermodynamics, and nonlinear dynamical systems; Stochastic Processes
Online access:	Full text

Description
Abstract:	We present a Monte Carlo approach for training partially observable diffusion processes. We apply the approach to diffusion networks, a stochastic version of continuous recurrent neural networks. The approach is aimed at learning probability distributions of continuous paths, not just expected values. Interestingly, the relevant activation statistics used by the learning rule presented here are inner products in the Hilbert space of square integrable functions. These inner products can be computed using Hebbian operations and do not require backpropagation of error signals. Moreover, standard kernel methods could potentially be applied to compute such inner products. We propose that the main reason that recurrent neural networks have not worked well in engineering applications (e.g., speech recognition) is that they implicitly rely on a very simplistic likelihood model. The diffusion network approach proposed here is much richer and may open new avenues for applications of recurrent neural networks. We present some analysis and simulations to support this view. Very encouraging results were obtained on a visual speech recognition task in which neural networks outperformed hidden Markov models.
ISSN:	0899-7667 1530-888X
DOI:	10.1162/08997660260028593
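
Illustrative sketch (not part of the catalog record or the article): the abstract describes activation statistics that are inner products of square-integrable paths, computable with Hebbian (product-and-accumulate) operations. The Python below is a minimal sketch of that idea only, under stated assumptions. The drift `-x + W tanh(x)`, the function names `simulate_path` and `l2_inner_products`, and all parameter values are hypothetical choices for illustration; they are not taken from the paper, and the code does not implement the authors' learning rule.

```python
import numpy as np


def simulate_path(w, x0, sigma, dt, n_steps, rng):
    """Euler-Maruyama sample path of dX = (-X + W tanh(X)) dt + sigma dB.

    The drift is an assumed, generic recurrent-network form chosen for
    illustration; it is not claimed to be the drift used in the article.
    Returns an array of shape (n_steps + 1, n_units).
    """
    n_units = x0.shape[0]
    path = np.empty((n_steps + 1, n_units))
    path[0] = x0
    for k in range(n_steps):
        drift = -path[k] + w @ np.tanh(path[k])
        noise = sigma * np.sqrt(dt) * rng.standard_normal(n_units)
        path[k + 1] = path[k] + drift * dt + noise
    return path


def l2_inner_products(a, b, dt):
    """Riemann-sum approximation of <a_i, b_j> = integral of a_i(t) b_j(t) dt.

    a has shape (T, n) and b has shape (T, m); the result has shape (n, m).
    Each entry accumulates products of two signals over time, which is the
    Hebbian-style operation the abstract alludes to.
    """
    return a.T @ b * dt


# Usage: one noisy path of a 3-unit network and its activation statistics.
rng = np.random.default_rng(0)
w = 0.5 * rng.standard_normal((3, 3))  # hypothetical weight matrix
path = simulate_path(w, np.zeros(3), sigma=0.1, dt=0.01, n_steps=1000, rng=rng)
stats = l2_inner_products(np.tanh(path), path, dt=0.01)  # shape (3, 3)
```

In a Monte Carlo setting one would average such statistics over many sampled paths; how those averages enter the parameter update is specified in the article itself and is not reproduced here.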