Cor Deep and the Sacrobosco Dataset: Detection of Visual Elements in Historical Documents

Recent advances in object detection facilitated by deep learning have led to numerous solutions in a myriad of fields ranging from medical diagnosis to autonomous driving. However, historical research is yet to reap the benefits of such advances. This is generally due to the low number of large, coh...

Bibliographic details
Published in:	Journal of imaging 2022-10, Vol.8 (10)
Main authors:	Büttner, Jochen ; Martinetz, Julius ; El-Hajj, Hassan ; Valleriani, Matteo
Format:	Article
Language:	eng
Online access:	Full text

container_end_page
container_issue	10
container_start_page
container_title	Journal of imaging
container_volume	8
creator	Büttner, Jochen ; Martinetz, Julius ; El-Hajj, Hassan ; Valleriani, Matteo
description	Recent advances in object detection facilitated by deep learning have led to numerous solutions in a myriad of fields ranging from medical diagnosis to autonomous driving. However, historical research is yet to reap the benefits of such advances. This is generally due to the low number of large, coherent, and annotated datasets of historical documents, as well as the overwhelming focus on Optical Character Recognition to support the analysis of historical documents. In this paper, we highlight the importance of visual elements, in particular illustrations in historical documents, and offer a public multi-class historical visual element dataset based on the corpus. Additionally, we train an image extraction model based on YOLO architecture and publish it through a publicly available web-service to detect and extract multi-class images from historical documents in an effort to bridge the gap between traditional and computational approaches in historical studies.
doi_str_mv	10.3390/jimaging8100285
format	Article
fullrecord	<record><control><sourceid>pubmed</sourceid><recordid>TN_cdi_pubmed_primary_36286379</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>36286379</sourcerecordid><originalsourceid>FETCH-pubmed_primary_362863793</originalsourceid><addsrcrecordid>eNqFjr0KwjAURoMgVtTZTe4LVNNef1pXW-muiE4S41UjbVKSdPDtLaKzw8cHhzMcxsYRnyKmfPZUlbgrfU8izuNk0WH9GCMM54jHgI2ce3LOozRul_ZYgMs4WeIq7bPTxljIiGoQ-gr-QbAT0pqLcdJAJrxw5Net4El6ZTSYGxyUa0QJeUkVae9AaSiU88Yq2eLMyObDh6x7E6Wj0fcHbLLN95sirJtLRddzbdtk-zr_WvCv8AbZTkYg</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Cor Deep and the Sacrobosco Dataset: Detection of Visual Elements in Historical Documents</title><source>DOAJ Directory of Open Access Journals</source><source>PubMed Central Open Access</source><source>MDPI - Multidisciplinary Digital Publishing Institute</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Büttner, Jochen ; Martinetz, Julius ; El-Hajj, Hassan ; Valleriani, Matteo</creator><creatorcontrib>Büttner, Jochen ; Martinetz, Julius ; El-Hajj, Hassan ; Valleriani, Matteo</creatorcontrib><description>Recent advances in object detection facilitated by deep learning have led to numerous solutions in a myriad of fields ranging from medical diagnosis to autonomous driving. However, historical research is yet to reap the benefits of such advances. This is generally due to the low number of large, coherent, and annotated datasets of historical documents, as well as the overwhelming focus on Optical Character Recognition to support the analysis of historical documents. In this paper, we highlight the importance of visual elements, in particular illustrations in historical documents, and offer a public multi-class historical visual element dataset based on the corpus. 
Additionally, we train an image extraction model based on YOLO architecture and publish it through a publicly available web-service to detect and extract multi-class images from historical documents in an effort to bridge the gap between traditional and computational approaches in historical studies.</description><identifier>EISSN: 2313-433X</identifier><identifier>DOI: 10.3390/jimaging8100285</identifier><identifier>PMID: 36286379</identifier><language>eng</language><publisher>Switzerland</publisher><ispartof>Journal of imaging, 2022-10, Vol.8 (10)</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><orcidid>0000-0002-8305-7138 ; 0000-0001-6931-7709 ; 0000-0002-0406-7777 ; 0000-0003-1758-3153</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,860,27903,27904</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36286379$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Büttner, Jochen</creatorcontrib><creatorcontrib>Martinetz, Julius</creatorcontrib><creatorcontrib>El-Hajj, Hassan</creatorcontrib><creatorcontrib>Valleriani, Matteo</creatorcontrib><title>Cor Deep and the Sacrobosco Dataset: Detection of Visual Elements in Historical Documents</title><title>Journal of imaging</title><addtitle>J Imaging</addtitle><description>Recent advances in object detection facilitated by deep learning have led to numerous solutions in a myriad of fields ranging from medical diagnosis to autonomous driving. However, historical research is yet to reap the benefits of such advances. This is generally due to the low number of large, coherent, and annotated datasets of historical documents, as well as the overwhelming focus on Optical Character Recognition to support the analysis of historical documents. 
In this paper, we highlight the importance of visual elements, in particular illustrations in historical documents, and offer a public multi-class historical visual element dataset based on the corpus. Additionally, we train an image extraction model based on YOLO architecture and publish it through a publicly available web-service to detect and extract multi-class images from historical documents in an effort to bridge the gap between traditional and computational approaches in historical studies.</description><issn>2313-433X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNqFjr0KwjAURoMgVtTZTe4LVNNef1pXW-muiE4S41UjbVKSdPDtLaKzw8cHhzMcxsYRnyKmfPZUlbgrfU8izuNk0WH9GCMM54jHgI2ce3LOozRul_ZYgMs4WeIq7bPTxljIiGoQ-gr-QbAT0pqLcdJAJrxw5Net4El6ZTSYGxyUa0QJeUkVae9AaSiU88Yq2eLMyObDh6x7E6Wj0fcHbLLN95sirJtLRddzbdtk-zr_WvCv8AbZTkYg</recordid><startdate>20221015</startdate><enddate>20221015</enddate><creator>Büttner, Jochen</creator><creator>Martinetz, Julius</creator><creator>El-Hajj, Hassan</creator><creator>Valleriani, Matteo</creator><scope>NPM</scope><orcidid>https://orcid.org/0000-0002-8305-7138</orcidid><orcidid>https://orcid.org/0000-0001-6931-7709</orcidid><orcidid>https://orcid.org/0000-0002-0406-7777</orcidid><orcidid>https://orcid.org/0000-0003-1758-3153</orcidid></search><sort><creationdate>20221015</creationdate><title>Cor Deep and the Sacrobosco Dataset: Detection of Visual Elements in Historical Documents</title><author>Büttner, Jochen ; Martinetz, Julius ; El-Hajj, Hassan ; Valleriani, Matteo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-pubmed_primary_362863793</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Büttner, Jochen</creatorcontrib><creatorcontrib>Martinetz, 
Julius</creatorcontrib><creatorcontrib>El-Hajj, Hassan</creatorcontrib><creatorcontrib>Valleriani, Matteo</creatorcontrib><collection>PubMed</collection><jtitle>Journal of imaging</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Büttner, Jochen</au><au>Martinetz, Julius</au><au>El-Hajj, Hassan</au><au>Valleriani, Matteo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Cor Deep and the Sacrobosco Dataset: Detection of Visual Elements in Historical Documents</atitle><jtitle>Journal of imaging</jtitle><addtitle>J Imaging</addtitle><date>2022-10-15</date><risdate>2022</risdate><volume>8</volume><issue>10</issue><eissn>2313-433X</eissn><abstract>Recent advances in object detection facilitated by deep learning have led to numerous solutions in a myriad of fields ranging from medical diagnosis to autonomous driving. However, historical research is yet to reap the benefits of such advances. This is generally due to the low number of large, coherent, and annotated datasets of historical documents, as well as the overwhelming focus on Optical Character Recognition to support the analysis of historical documents. In this paper, we highlight the importance of visual elements, in particular illustrations in historical documents, and offer a public multi-class historical visual element dataset based on the corpus. 
Additionally, we train an image extraction model based on YOLO architecture and publish it through a publicly available web-service to detect and extract multi-class images from historical documents in an effort to bridge the gap between traditional and computational approaches in historical studies.</abstract><cop>Switzerland</cop><pmid>36286379</pmid><doi>10.3390/jimaging8100285</doi><orcidid>https://orcid.org/0000-0002-8305-7138</orcidid><orcidid>https://orcid.org/0000-0001-6931-7709</orcidid><orcidid>https://orcid.org/0000-0002-0406-7777</orcidid><orcidid>https://orcid.org/0000-0003-1758-3153</orcidid></addata></record>
fulltext	fulltext
identifier	EISSN: 2313-433X
ispartof	Journal of imaging, 2022-10, Vol.8 (10)
issn	2313-433X
language	eng
recordid	cdi_pubmed_primary_36286379
source	DOAJ Directory of Open Access Journals; PubMed Central Open Access; MDPI - Multidisciplinary Digital Publishing Institute; EZB-FREE-00999 freely available EZB journals; PubMed Central
title	Cor Deep and the Sacrobosco Dataset: Detection of Visual Elements in Historical Documents
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T19%3A25%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pubmed&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cor%20Deep%20and%20the%20Sacrobosco%20Dataset:%20Detection%20of%20Visual%20Elements%20in%20Historical%20Documents&rft.jtitle=Journal%20of%20imaging&rft.au=B%C3%BCttner,%20Jochen&rft.date=2022-10-15&rft.volume=8&rft.issue=10&rft.eissn=2313-433X&rft_id=info:doi/10.3390/jimaging8100285&rft_dat=%3Cpubmed%3E36286379%3C/pubmed%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/36286379&rfr_iscdi=true