A unified multilingual handwriting recognition system using multigrams sub-lexical units
Published in: | Pattern recognition letters 2019-04, Vol.121, p.68-76 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | • We present a unified multilingual system for handwriting recognition. • The system is based on a two-level recognition scheme using an optical model and a language model. • We propose to perform language modeling with sub-lexical units. • We investigate bilingual recognition on French and English datasets. • Using multigram units makes it possible to build a multilingual system that benefits from combining the languages.
We address the design of a unified multilingual system for handwriting recognition. Most multilingual systems rely on specialized models, each trained on a single language, one of which is selected at test time. While some recognition systems are based on a unified optical model, building a unified language model remains a major issue, as traditional language models are generally trained on corpora with a large word lexicon per language. Here, we propose a solution based on language models over sub-lexical units, called multigrams. Using multigrams strongly reduces the lexicon size and thus decreases the language model complexity. This makes it possible to design an end-to-end unified multilingual recognition system in which a single optical model and a single language model are both trained on all the languages. We discuss the impact of language unification on each model and show that our system reaches the performance of state-of-the-art methods with a strong reduction in complexity. |
---|---|
ISSN: | 0167-8655 1872-7344 |
DOI: | 10.1016/j.patrec.2018.07.027 |
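
The summary's central idea, replacing a per-language word lexicon with a shared inventory of sub-lexical units, can be illustrated with a minimal sketch. This is not the authors' method: the record does not describe how the multigrams are obtained, so the greedy longest-match split below, the `segment` function, and the toy `UNITS` inventory are all assumptions made purely for illustration.

```python
def segment(word, units, max_len=4):
    """Greedy longest-match split of `word` into sub-lexical units.

    Illustrative assumption only: at each position, take the longest
    substring (up to `max_len` characters) found in `units`, and fall
    back to a single character when nothing longer matches.
    """
    pieces = []
    i = 0
    while i < len(word):
        for n in range(min(max_len, len(word) - i), 0, -1):
            candidate = word[i:i + n]
            if n == 1 or candidate in units:
                pieces.append(candidate)
                i += n
                break
    return pieces


# Hypothetical toy inventory of character sequences; a real system
# would derive its units from training data.
UNITS = {"tion", "na", "al", "re", "in", "ter"}

# Words spelled identically in French and English.
words = ["nation", "national", "international", "relation"]
segmented = {w: segment(w, UNITS) for w in words}

# Distinct units actually used; the language model is built over these
# instead of over whole words.
used_units = sorted({u for parts in segmented.values() for u in parts})
```

Because the same units recur across words and across languages, the unit inventory can be expected to grow far more slowly than a word lexicon as vocabulary is added, which is the complexity reduction the summary refers to. The four-word example is too small to show that effect numerically.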