Offline handwritten mathematical expression recognition based on YOLOv5s

Bibliographic details
Published in: The Visual Computer 2024-03, Vol. 40 (3), p. 1439-1452
Main authors: Li, Fei, Fang, Hongbo, Wang, Dengzhun, Liu, Ruixin, Hou, Qing, Xie, Benliang
Format: Article
Language: English
Description
Summary: Error accumulation is a persistent problem in traditional offline handwritten mathematical expression recognition (OHMER), because handwritten formulas have a two-dimensional structure and are written with considerable freedom. This study proposes an OHMER method based on YOLOv5s. First, YOLOv5s is used to recognize the category and spatial location of each symbol in the expression image. Second, a spatial attention mechanism is introduced into YOLOv5s to enlarge the differences among symbol categories and improve accuracy. Third, a bidirectional long short-term memory network (BiLSTM) is introduced to give the symbols context-related information. Finally, the contextual relevance of the symbols is improved by increasing the number of BiLSTM layers, achieving an accuracy of 95.67%. A mathematical expression relationship tree is then built from the symbol recognition results, and clustering theory is used to analyze the two-dimensional structure of the expressions. The expression recognition accuracy on the CROHME 2019 test set is 65.47%. The recognition rate of YOLOv5s_SB3CT is second only to that of PAL; however, the recognition rate of YOLOv5s_SB3CT is higher than that of PAL when the error is less than three. This finding indicates that the proposed model is more fault-tolerant and stable than the other models.
ISSN: 0178-2789, 1432-2315
DOI: 10.1007/s00371-023-02859-1
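
The summary describes turning detected symbols (category plus spatial location) into a two-dimensional expression structure. The Python sketch below illustrates that general idea only. It is not the authors' method: it replaces the relationship tree and clustering analysis with a simple vertical-offset heuristic. The names `Symbol`, `relation`, `to_latex`, and the threshold `ratio=0.4` are hypothetical choices made for this illustration, and the box coordinates are assumed to come from a detector such as YOLOv5s.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Symbol:
    """One detected symbol (hypothetical container for detector output)."""
    label: str  # recognized class, e.g. "x" or "2"
    cx: float   # box center x in pixels
    cy: float   # box center y in pixels (y grows downward)
    h: float    # box height in pixels


def relation(base: Symbol, cur: Symbol, ratio: float = 0.4) -> str:
    """Classify cur relative to base from the normalized vertical offset.

    Assumption: a center raised or lowered by more than `ratio` times the
    base height marks a superscript or subscript. This is a heuristic,
    not the clustering analysis mentioned in the summary.
    """
    offset = (cur.cy - base.cy) / base.h
    if offset < -ratio:
        return "sup"
    if offset > ratio:
        return "sub"
    return "right"


def to_latex(symbols: List[Symbol]) -> str:
    """Read symbols left to right and emit a flat LaTeX-like string."""
    ordered = sorted(symbols, key=lambda s: s.cx)
    if not ordered:
        return ""
    out = [ordered[0].label]
    base = ordered[0]  # most recent symbol on the main baseline
    for cur in ordered[1:]:
        rel = relation(base, cur)
        if rel == "sup":
            out.append("^{" + cur.label + "}")
        elif rel == "sub":
            out.append("_{" + cur.label + "}")
        else:
            out.append(cur.label)
            base = cur  # only baseline symbols become the new reference
    return "".join(out)


if __name__ == "__main__":
    boxes = [
        Symbol("x", 10, 50, 20),
        Symbol("2", 25, 38, 10),
        Symbol("+", 40, 50, 15),
        Symbol("y", 55, 50, 20),
    ]
    print(to_latex(boxes))
```

This sketch handles only single-symbol superscripts and subscripts along one baseline. Fractions, radicals, and nested scripts would need a genuine tree structure of the kind the summary mentions.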