An Encoder-Decoder Neural Network for Indefinite Length Digit Sequences in Natural Scene Recognition
Published in: | Journal of Physics: Conference Series 2019-11, Vol.1345 (2), p.022025 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Online access: | Full text |
Abstract: | Extracting text from raw images has long been a popular and difficult problem in computer vision research. Blur, uneven illumination, and complex backgrounds make it hard to recognize digits in natural scenes reliably. This paper proposes an encoder-decoder neural network model for recognizing digit sequences of indefinite length in natural scenes. The encoder is a convolutional neural network (CNN), and the decoder is a long short-term memory (LSTM) network. The encoder accepts a fixed-format image and outputs a fixed-length feature vector; the decoder accepts the feature vector and outputs a predicted sequence of indefinite length. Evaluation on the Google Street View House Numbers (SVHN) dataset shows that the method performs well: after 10 hours of training, it reaches an accuracy of 96.57%. |
---|---|
ISSN: | 1742-6588 1742-6596 |
DOI: | 10.1088/1742-6596/1345/2/022025 |
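
The abstract describes a decoder that turns a fixed-length feature vector into a digit sequence of indefinite length. The record does not say how the decoder decides when to stop, so the sketch below assumes one common choice: an extra end-of-sequence class alongside the ten digits, with greedy selection at each step. The names `EOS`, `greedy_decode`, and `toy_step` are hypothetical, and `toy_step` is only a stand-in for a trained LSTM cell, not the paper's model.

```python
from typing import Callable, List, Sequence, Tuple

EOS = 10  # assumed end-of-sequence class; digits occupy classes 0-9

State = Tuple[float, ...]
StepFn = Callable[[State, int], Tuple[List[float], State]]


def greedy_decode(feature: Sequence[float], step: StepFn, max_len: int = 6) -> List[int]:
    """Emit digits until the decoder predicts EOS or max_len steps have run."""
    state: State = tuple(feature)  # fixed-length feature vector seeds the decoder state
    prev = EOS                     # assumption: EOS also serves as the start token
    digits: List[int] = []
    for _ in range(max_len):
        scores, state = step(state, prev)
        # Greedy choice: take the highest-scoring class at this step.
        prev = max(range(len(scores)), key=scores.__getitem__)
        if prev == EOS:
            break
        digits.append(prev)
    return digits


def toy_step(state: State, prev_token: int) -> Tuple[List[float], State]:
    """Stand-in for a trained LSTM cell: reads the next slot of the state.

    A negative slot means "no more digits" and yields EOS.
    """
    head, rest = state[0], state[1:] + (-1.0,)
    target = EOS if head < 0 else int(head)
    scores = [1.0 if k == target else 0.0 for k in range(11)]
    return scores, rest


if __name__ == "__main__":
    # The input always has five slots; the output length varies.
    print(greedy_decode([2, 5, 7, -1, -1], toy_step))
    print(greedy_decode([9, -1, -1, -1, -1], toy_step))
```

The point of the sketch is the control flow: the input has a fixed size, while the number of loop iterations, and so the output length, depends on when the end-of-sequence class wins. In a real model, `step` would wrap an LSTM cell followed by an 11-way classifier.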