A Page Object Detection Method Based on Mask R-CNN


Bibliographic details
Published in: IEEE Access, 2021, Vol. 9, pp. 143448-143457
Main authors: Xu, Canhui; Shi, Cao; Bi, Hengyue; Liu, Chuanqi; Yuan, Yongfeng; Guo, Haoyan; Chen, Yinong
Format: Article
Language: English
Online access: Full text
Description
Summary: Page object detection is crucial for document understanding. The granularity at which page objects are defined affects detection performance. This study considers block-level region detection within the inherent hierarchical structure of document images. Inspired by the Mask R-CNN (Region-based Convolutional Neural Networks) method, an end-to-end network is proposed that performs object classification, bounding box identification, and page object mask generation at the same time. LaTeX-based synthetic document generation is designed to enlarge the training data: a large number of synthetic page images are generated for training to alleviate the shortage of annotated data. Compared with existing methods from page object detection competitions, the proposed method achieves better results, with a mAP of 0.917 on the detection of page objects such as tables, figures, and maths.
ISSN:2169-3536
DOI:10.1109/ACCESS.2021.3121152