Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | We present a new table structure recognition (TSR) approach, called
TSRFormer, to robustly recognize the structures of complex tables with
geometrical distortions from various table images. Unlike previous methods, we
formulate table separation line prediction as a line regression problem instead
of an image segmentation problem and propose a new two-stage dynamic queries
enhanced DETR based separation line regression approach, named DQ-DETR, to
predict separation lines from table images directly. Compared to vanilla DETR,
we propose three improvements in DQ-DETR to make the two-stage DETR framework
work efficiently and effectively for the separation line prediction task: 1) A
new query design, named Dynamic Query, to decouple a single line query into
separable point queries, which could intuitively improve the localization
accuracy for regression tasks; 2) A dynamic-queries-based progressive line
regression approach that progressively regresses points on the line, which
further enhances localization accuracy for distorted tables; 3) A
prior-enhanced matching strategy to solve the slow convergence issue of DETR.
After separation line prediction, a simple relation network based cell merging
module is used to recover spanning cells. With these new techniques, our
TSRFormer achieves state-of-the-art performance on several benchmark datasets,
including SciTSR, PubTabNet, WTW and FinTabNet. Furthermore, we have validated
the robustness and high localization accuracy of our approach to tables with
complex structures, borderless cells, large blank spaces, empty or spanning
cells as well as distorted or even curved shapes on a more challenging
real-world in-house dataset. |
DOI: | 10.48550/arxiv.2303.11615 |