UnSupDLA: Towards Unsupervised Document Layout Analysis

Document layout analysis is a key area in document research, involving techniques like text mining and visual analysis. Despite various methods developed to tackle layout analysis, a critical but frequently overlooked problem is the scarcity of labeled data needed for analyses. With the rise of inte...

Bibliographic details
Published in:	arXiv.org 2024-06
Main authors:	Talha Uddin Sheikh; Shehzadi, Tahira; Khurram Azeem Hashmi; Stricker, Didier; Afzal, Muhammad Zeshan
Format:	Article
Language:	eng
Subjects:	Documents; Image enhancement; Labels; Layouts; Masks; Object recognition; Training
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Talha Uddin Sheikh; Shehzadi, Tahira; Khurram Azeem Hashmi; Stricker, Didier; Afzal, Muhammad Zeshan
description	Document layout analysis is a key area in document research, involving techniques like text mining and visual analysis. Despite various methods developed to tackle layout analysis, a critical but frequently overlooked problem is the scarcity of labeled data needed for analyses. With the rise of internet use, an overwhelming number of documents are now available online, making the process of accurately labeling them for research purposes increasingly challenging and labor-intensive. Moreover, the diversity of documents online presents a unique set of challenges in maintaining the quality and consistency of these labels, further complicating document layout analysis in the digital era. To address this, we employ a vision-based approach for analyzing document layouts designed to train a network without labels. Instead, we focus on pre-training, initially generating simple object masks from the unlabeled document images. These masks are then used to train a detector, enhancing object detection and segmentation performance. The model's effectiveness is further amplified through several unsupervised training iterations, continuously refining its performance. This approach significantly advances document layout analysis, particularly precision and efficiency, without labels.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3066576411</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3066576411</sourcerecordid><originalsourceid>FETCH-proquest_journals_30665764113</originalsourceid><addsrcrecordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mQwD80LLi1w8XG0UgjJL08sSilWCM0rLi1ILSrLLE5NUXDJTy7NTc0rUfBJrMwvLVFwzEvMqSzOLOZhYE1LzClO5YXS3AzKbq4hzh66BUX5haWpxSXxWfmlRUDFxfHGBmZmpuZmJoaGxsSpAgD2JDWi</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3066576411</pqid></control><display><type>article</type><title>UnSupDLA: Towards Unsupervised Document Layout Analysis</title><source>Free E- Journals</source><creator>Talha Uddin Sheikh ; Shehzadi, Tahira ; Khurram Azeem Hashmi ; Stricker, Didier ; Afzal, Muhammad Zeshan</creator><creatorcontrib>Talha Uddin Sheikh ; Shehzadi, Tahira ; Khurram Azeem Hashmi ; Stricker, Didier ; Afzal, Muhammad Zeshan</creatorcontrib><description>Document layout analysis is a key area in document research, involving techniques like text mining and visual analysis. Despite various methods developed to tackle layout analysis, a critical but frequently overlooked problem is the scarcity of labeled data needed for analyses. With the rise of internet use, an overwhelming number of documents are now available online, making the process of accurately labeling them for research purposes increasingly challenging and labor-intensive. Moreover, the diversity of documents online presents a unique set of challenges in maintaining the quality and consistency of these labels, further complicating document layout analysis in the digital era. To address this, we employ a vision-based approach for analyzing document layouts designed to train a network without labels. Instead, we focus on pre-training, initially generating simple object masks from the unlabeled document images. 
These masks are then used to train a detector, enhancing object detection and segmentation performance. The model's effectiveness is further amplified through several unsupervised training iterations, continuously refining its performance. This approach significantly advances document layout analysis, particularly precision and efficiency, without labels.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Documents ; Image enhancement ; Labels ; Layouts ; Masks ; Object recognition ; Training</subject><ispartof>arXiv.org, 2024-06</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>776,780</link.rule.ids></links><search><creatorcontrib>Talha Uddin Sheikh</creatorcontrib><creatorcontrib>Shehzadi, Tahira</creatorcontrib><creatorcontrib>Khurram Azeem Hashmi</creatorcontrib><creatorcontrib>Stricker, Didier</creatorcontrib><creatorcontrib>Afzal, Muhammad Zeshan</creatorcontrib><title>UnSupDLA: Towards Unsupervised Document Layout Analysis</title><title>arXiv.org</title><description>Document layout analysis is a key area in document research, involving techniques like text mining and visual analysis. Despite various methods developed to tackle layout analysis, a critical but frequently overlooked problem is the scarcity of labeled data needed for analyses. 
With the rise of internet use, an overwhelming number of documents are now available online, making the process of accurately labeling them for research purposes increasingly challenging and labor-intensive. Moreover, the diversity of documents online presents a unique set of challenges in maintaining the quality and consistency of these labels, further complicating document layout analysis in the digital era. To address this, we employ a vision-based approach for analyzing document layouts designed to train a network without labels. Instead, we focus on pre-training, initially generating simple object masks from the unlabeled document images. These masks are then used to train a detector, enhancing object detection and segmentation performance. The model's effectiveness is further amplified through several unsupervised training iterations, continuously refining its performance. This approach significantly advances document layout analysis, particularly precision and efficiency, without labels.</description><subject>Documents</subject><subject>Image enhancement</subject><subject>Labels</subject><subject>Layouts</subject><subject>Masks</subject><subject>Object recognition</subject><subject>Training</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mQwD80LLi1w8XG0UgjJL08sSilWCM0rLi1ILSrLLE5NUXDJTy7NTc0rUfBJrMwvLVFwzEvMqSzOLOZhYE1LzClO5YXS3AzKbq4hzh66BUX5haWpxSXxWfmlRUDFxfHGBmZmpuZmJoaGxsSpAgD2JDWi</recordid><startdate>20240610</startdate><enddate>20240610</enddate><creator>Talha Uddin Sheikh</creator><creator>Shehzadi, Tahira</creator><creator>Khurram Azeem Hashmi</creator><creator>Stricker, Didier</creator><creator>Afzal, Muhammad Zeshan</creator><general>Cornell 
University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240610</creationdate><title>UnSupDLA: Towards Unsupervised Document Layout Analysis</title><author>Talha Uddin Sheikh ; Shehzadi, Tahira ; Khurram Azeem Hashmi ; Stricker, Didier ; Afzal, Muhammad Zeshan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_30665764113</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Documents</topic><topic>Image enhancement</topic><topic>Labels</topic><topic>Layouts</topic><topic>Masks</topic><topic>Object recognition</topic><topic>Training</topic><toplevel>online_resources</toplevel><creatorcontrib>Talha Uddin Sheikh</creatorcontrib><creatorcontrib>Shehzadi, Tahira</creatorcontrib><creatorcontrib>Khurram Azeem Hashmi</creatorcontrib><creatorcontrib>Stricker, Didier</creatorcontrib><creatorcontrib>Afzal, Muhammad Zeshan</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering 
Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Talha Uddin Sheikh</au><au>Shehzadi, Tahira</au><au>Khurram Azeem Hashmi</au><au>Stricker, Didier</au><au>Afzal, Muhammad Zeshan</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>UnSupDLA: Towards Unsupervised Document Layout Analysis</atitle><jtitle>arXiv.org</jtitle><date>2024-06-10</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Document layout analysis is a key area in document research, involving techniques like text mining and visual analysis. Despite various methods developed to tackle layout analysis, a critical but frequently overlooked problem is the scarcity of labeled data needed for analyses. With the rise of internet use, an overwhelming number of documents are now available online, making the process of accurately labeling them for research purposes increasingly challenging and labor-intensive. Moreover, the diversity of documents online presents a unique set of challenges in maintaining the quality and consistency of these labels, further complicating document layout analysis in the digital era. To address this, we employ a vision-based approach for analyzing document layouts designed to train a network without labels. Instead, we focus on pre-training, initially generating simple object masks from the unlabeled document images. These masks are then used to train a detector, enhancing object detection and segmentation performance. 
The model's effectiveness is further amplified through several unsupervised training iterations, continuously refining its performance. This approach significantly advances document layout analysis, particularly precision and efficiency, without labels.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2024-06
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_3066576411
source	Free E- Journals
subjects	Documents; Image enhancement; Labels; Layouts; Masks; Object recognition; Training
title	UnSupDLA: Towards Unsupervised Document Layout Analysis
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T23%3A47%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=UnSupDLA:%20Towards%20Unsupervised%20Document%20Layout%20Analysis&rft.jtitle=arXiv.org&rft.au=Talha%20Uddin%20Sheikh&rft.date=2024-06-10&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3066576411%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3066576411&rft_id=info:pmid/&rfr_iscdi=true
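
The description field above outlines a three-step loop: generate simple masks from unlabeled pages, train a detector on those masks, then repeat the training for several unsupervised rounds. The following is a minimal toy sketch of that loop, not the authors' implementation. The record does not say how masks are generated or which detector is used, so `initial_masks`, `train_detector`, `predict`, and `self_train` are hypothetical stand-ins: an intensity threshold plays the role of the mask generator, and a single learned cut-off plays the role of the detector.

```python
from typing import List

Image = List[List[float]]  # grayscale page in [0, 1]; 0 = ink, 1 = blank paper
Mask = List[List[int]]     # 1 = foreground (layout object), 0 = background


def initial_masks(image: Image, threshold: float = 0.5) -> Mask:
    """Hypothetical label-free mask generator: dark pixels count as foreground."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def train_detector(images: List[Image], masks: List[Mask]) -> float:
    """Toy 'detector' (an assumption): one intensity cut-off placed halfway
    between the mean foreground and mean background intensity."""
    fg: List[float] = []
    bg: List[float] = []
    for image, mask in zip(images, masks):
        for img_row, mask_row in zip(image, mask):
            for px, label in zip(img_row, mask_row):
                (fg if label == 1 else bg).append(px)
    if not fg or not bg:
        return 0.5  # fall back when one class is empty
    return (sum(fg) / len(fg) + sum(bg) / len(bg)) / 2


def predict(image: Image, cutoff: float) -> Mask:
    """Apply the toy detector to produce a refined pseudo-mask."""
    return [[1 if px < cutoff else 0 for px in row] for row in image]


def self_train(images: List[Image], rounds: int = 3) -> float:
    """Masks -> detector -> predictions as new pseudo-labels, repeated."""
    masks = [initial_masks(img) for img in images]
    cutoff = train_detector(images, masks)
    for _ in range(rounds):
        # The detector's own predictions become the next round's training targets.
        masks = [predict(img, cutoff) for img in images]
        cutoff = train_detector(images, masks)
    return cutoff


pages = [[[0.1, 0.9], [0.2, 0.8]]]
cutoff = self_train(pages)
refined = predict(pages[0], cutoff)
```

In a realistic setting the cut-off model would be replaced by an actual detection and segmentation network, and each round would need some filtering of low-quality pseudo-masks; neither detail is given in this record.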