Detecting Mathematical Expressions in Scientific Document Images Using a U-Net Trained on a Diverse Dataset

Bibliographic details
Published in: IEEE Access, 2019-01, Vol. 7, p. 1-1
Main authors: Ohyama, Wataru; Suzuki, Masakazu; Uchida, Seiichi
Format: Article
Language: English
Online access: Full text
Abstract: A detection method for mathematical expressions in scientific document images is proposed. Inspired by the promising performance of U-Net, a convolutional network architecture originally proposed for the semantic segmentation of biomedical images, the proposed method uses image conversion by a U-Net framework. The method uses no information from mathematical or linguistic grammar, so it can serve as a supplemental bypass in the conventional mathematical optical character recognition (OCR) pipeline. The evaluation experiments confirmed that (1) the mathematical symbol and expression detection performance of the proposed method is superior to that of InftyReader, state-of-the-art software for mathematical OCR; (2) the coverage of document-style variation by the training dataset is important; and (3) retraining with a small number of additional training samples is effective for improving performance. An additional contribution is the release of a dataset for benchmarking OCR for scientific documents.
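The abstract frames detection as image conversion: a U-Net maps the page image to a per-pixel map of mathematical-expression regions, with no grammatical post-processing. The sketch below is a minimal illustration of that idea, not the published network; the choice of PyTorch, the depth, the channel widths, and the single-channel math/non-math output head are assumptions made here for clarity.

```python
# Minimal sketch of a U-Net-style math-expression detector.
# NOT the authors' exact architecture: depth, channel widths, and the
# binary math/non-math output are illustrative assumptions.
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions with ReLU, as in the original U-Net."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class MathUNet(nn.Module):
    """Maps a grayscale page image to a per-pixel math-expression score map."""

    def __init__(self) -> None:
        super().__init__()
        self.enc1 = conv_block(1, 32)
        self.enc2 = conv_block(32, 64)
        self.enc3 = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.dec2 = conv_block(128, 64)   # 64 skip channels + 64 upsampled
        self.up1 = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec1 = conv_block(64, 32)    # 32 skip channels + 32 upsampled
        self.head = nn.Conv2d(32, 1, kernel_size=1)  # 1 channel: math vs. not

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)                  # full resolution
        e2 = self.enc2(self.pool(e1))      # 1/2 resolution
        e3 = self.enc3(self.pool(e2))      # 1/4 resolution (bottleneck)
        d2 = self.dec2(torch.cat([self.up2(e3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)               # logits; apply sigmoid for a mask


if __name__ == "__main__":
    net = MathUNet()
    page = torch.rand(1, 1, 256, 256)      # dummy grayscale page crop
    mask_logits = net(page)
    print(mask_logits.shape)               # torch.Size([1, 1, 256, 256])
```

Under these assumptions, such a network would be trained with a per-pixel loss (for example torch.nn.BCEWithLogitsLoss) against ground-truth masks derived from annotated pages, and the predicted mask could be grouped into expression-level boxes by connected-component analysis; these training details are likewise assumptions, not taken from the record above.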
DOI: 10.1109/ACCESS.2019.2945825
ISSN: 2169-3536
Source: IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
Subjects: Character recognition; Computer architecture; Convolutional neural networks; Datasets; Document image analysis; Image segmentation; Mathematical analysis; Mathematical expression detection; Medical imaging; Neural networks; Object detection; Optical character recognition; Performance enhancement; Retraining; Training