LORE: Logical Location Regression Network for Table Structure Recognition
Main authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Abstract: | Table structure recognition (TSR) aims to extract tables in images into
machine-understandable formats. Recent methods solve this problem by predicting
the adjacency relations of detected cell boxes, or by learning to generate the
corresponding markup sequences from the table images. However, they either
rely on additional heuristic rules to recover the table structures, or require
large amounts of training data and time-consuming sequential decoders. In this
paper, we propose an alternative paradigm. We model TSR as a logical location
regression problem and propose a new TSR framework called LORE, standing for
LOgical location REgression network, which for the first time combines logical
location regression with spatial location regression of table cells.
The proposed LORE is conceptually simpler, easier to train, and more accurate
than previous TSR models of other paradigms. Experiments on standard benchmarks
demonstrate that LORE consistently outperforms prior art. Code is available at
https://github.com/AlibabaResearch/AdvancedLiterateMachinery/tree/main/DocumentUnderstanding/LORE-TSR. |
---|---|
DOI: | 10.48550/arxiv.2303.03730 |
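
To make the idea of logical location regression in the abstract more concrete, here is a minimal sketch. It is not the LORE implementation. It assumes that a cell's logical location is the tuple (start row, end row, start column, end column) on the table grid, a representation the abstract does not spell out, and the names `Cell` and `cells_to_grid` are hypothetical. The sketch only illustrates why such indices, once predicted alongside each cell's bounding box, give the table structure directly, without adjacency heuristics or a sequential markup decoder.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Cell:
    """A detected table cell (hypothetical container, not taken from the LORE code)."""
    box: Tuple[float, float, float, float]  # spatial location: x1, y1, x2, y2 in pixels
    logical: Tuple[int, int, int, int]      # assumed logical location: row_start, row_end, col_start, col_end
    text: str = ""


def cells_to_grid(cells: List[Cell]) -> List[List[Optional[int]]]:
    """Write each cell's index into every grid slot that the cell spans."""
    n_rows = max(c.logical[1] for c in cells) + 1
    n_cols = max(c.logical[3] for c in cells) + 1
    grid: List[List[Optional[int]]] = [[None] * n_cols for _ in range(n_rows)]
    for idx, cell in enumerate(cells):
        r0, r1, c0, c1 = cell.logical
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                grid[r][c] = idx
    return grid


# Toy table: one header cell spanning two columns, then two ordinary cells.
cells = [
    Cell((0, 0, 200, 20), (0, 0, 0, 1), "header"),
    Cell((0, 20, 100, 40), (1, 1, 0, 0), "a"),
    Cell((100, 20, 200, 40), (1, 1, 1, 1), "b"),
]
grid = cells_to_grid(cells)
print(grid)  # [[0, 0], [1, 2]]
```

In the toy table, the header cell occupies both slots of the first row, so its column span can be read straight off the grid. A slot left as `None` would indicate a position that no predicted cell covers.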