MinerU: An Open-Source Solution for Precise Document Content Extraction

Document content analysis has been a crucial research area in computer vision. Despite significant advancements in methods such as OCR, layout detection, and formula recognition, existing open-source solutions struggle to consistently deliver high-quality content extraction due to the diversity in d...

Bibliographic details
Published in: arXiv.org 2024-09
Main authors: Wang, Bin; Xu, Chao; Zhao, Xiaomeng; Ouyang, Linke; Wu, Fan; Zhao, Zhiyuan; Xu, Rui; Liu, Kaiwen; Qu, Yuan; Shang, Fukai; Zhang, Bo; Wei, Liqun; Sui, Zhihao; Li, Wei; Shi, Botian; Yu, Qiao; Lin, Dahua; He, Conghui
Format: Article
Language: English
Online access: Full text