An end-to-end network for irregular printed Mongolian recognition
Mongolian is a language spoken in Inner Mongolia, China. In the recognition process, due to the shooting angle and other reasons, the image and text will be deformed, which will cause certain difficulties in recognition. This paper propose a triplet attention Mogrifier network (TAMN) for print Mongo...
Gespeichert in:
Veröffentlicht in: | International journal on document analysis and recognition 2022-03, Vol.25 (1), p.41-50 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Mongolian is a language spoken in Inner Mongolia, China. In the recognition process, due to the shooting angle and other reasons, the image and text will be deformed, which will cause certain difficulties in recognition. This paper propose a triplet attention Mogrifier network (TAMN) for print Mongolian text recognition. The network uses a spatial transformation network to correct deformed Mongolian images. It uses gated recurrent convolution layers (GRCL) combine with triplet attention module to extract image features for the corrected images. The Mogrifier long short-term memory (LSTM) network gets the context sequence information in the feature and finally uses the decoder’s LSTM attention to get the prediction result. Experimental results show the spatial transformation network can effectively recognize deformed Mongolian images, and the recognition accuracy can reach 90.30%. This network achieves good performance in Mongolian text recognition compare with the current mainstream text recognition network. The dataset has been publicly available at
https://github.com/ShaoDonCui/Mongolian-recognition
. |
---|---|
ISSN: | 1433-2833 1433-2825 |
DOI: | 10.1007/s10032-021-00388-y |