Enhancing Complex Formula Recognition with Hierarchical Detail-Focused Network
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | Hierarchical and complex Mathematical Expression Recognition (MER) is
challenging because a single formula can have multiple possible interpretations,
which complicates both parsing and evaluation. In this paper, we introduce the
Hierarchical Detail-Focused Recognition dataset (HDR), the first dataset
specifically designed to address these issues. It consists of a large-scale
training set, HDR-100M, which offers unprecedented scale and diversity with one
hundred million training instances, and a test set, HDR-Test, which includes
multiple interpretations of complex hierarchical formulas for comprehensive
evaluation of model performance. In addition, the parsing of complex formulas
often suffers from errors in fine-grained details. To address this, we propose
the Hierarchical Detail-Focused Recognition Network (HDNet), an innovative
framework that incorporates a hierarchical sub-formula module focused on the
precise handling of formula details, thereby significantly enhancing MER
performance. Experimental results demonstrate that HDNet outperforms existing
MER models across various datasets. |
DOI: | 10.48550/arxiv.2409.11677 |