Detecting Mathematical Expressions in Scientific Document Images Using a U-Net Trained on a Diverse Dataset

Bibliographic details
Published in: IEEE Access, 2019-01, Vol. 7, p. 1-1
Main authors: Ohyama, Wataru; Suzuki, Masakazu; Uchida, Seiichi
Format: Article
Language: English
Online access: Full text
Abstract: A detection method for mathematical expressions in scientific document images is proposed. Inspired by the promising performance of U-Net, a convolutional network architecture originally proposed for the semantic segmentation of biomedical images, the proposed method uses image conversion by a U-Net framework. The method uses no information from mathematical or linguistic grammar, so it can serve as a supplemental bypass in the conventional mathematical optical character recognition (OCR) pipeline. The evaluation experiments confirmed that (1) the mathematical symbol and expression detection performance of the proposed method is superior to that of InftyReader, state-of-the-art software for mathematical OCR; (2) the coverage of document-style variation by the training dataset is important; and (3) retraining with a small number of additional training samples is effective for improving performance. An additional contribution is the release of a dataset for benchmarking OCR for scientific documents.
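The abstract frames detection as image conversion: a U-Net maps the page image to a per-pixel map of mathematical-expression regions, with no grammatical post-processing. The sketch below is a minimal illustration of that idea, not the published network; the choice of PyTorch, the depth, the channel widths, and the single-channel math/non-math output head are assumptions made here for clarity.

```python
# Minimal sketch of a U-Net-style math-expression detector.
# NOT the authors' exact architecture: depth, channel widths, and the
# binary math/non-math output are illustrative assumptions.
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions with ReLU, as in the original U-Net."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class MathUNet(nn.Module):
    """Maps a grayscale page image to a per-pixel math-expression score map."""

    def __init__(self) -> None:
        super().__init__()
        self.enc1 = conv_block(1, 32)
        self.enc2 = conv_block(32, 64)
        self.enc3 = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.dec2 = conv_block(128, 64)   # 64 skip channels + 64 upsampled
        self.up1 = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec1 = conv_block(64, 32)    # 32 skip channels + 32 upsampled
        self.head = nn.Conv2d(32, 1, kernel_size=1)  # 1 channel: math vs. not

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)                  # full resolution
        e2 = self.enc2(self.pool(e1))      # 1/2 resolution
        e3 = self.enc3(self.pool(e2))      # 1/4 resolution (bottleneck)
        d2 = self.dec2(torch.cat([self.up2(e3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)               # logits; apply sigmoid for a mask


if __name__ == "__main__":
    net = MathUNet()
    page = torch.rand(1, 1, 256, 256)      # dummy grayscale page crop
    mask_logits = net(page)
    print(mask_logits.shape)               # torch.Size([1, 1, 256, 256])
```

Under these assumptions, such a network would be trained with a per-pixel loss (for example torch.nn.BCEWithLogitsLoss) against ground-truth masks derived from annotated pages, and the predicted mask could be grouped into expression-level boxes by connected-component analysis; these training details are likewise assumptions, not taken from the record above.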
DOI: 10.1109/ACCESS.2019.2945825
ISSN: 2169-3536
Source: IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
Subjects: Character recognition; Computer architecture; Convolutional neural networks; Datasets; Document image analysis; Image segmentation; Mathematical analysis; Mathematical expression detection; Medical imaging; Neural networks; Object detection; Optical character recognition; Performance enhancement; Retraining; Training