Detecting Mathematical Expressions in Scientific Document Images Using a U-Net Trained on a Diverse Dataset
A detection method for mathematical expressions in scientific document images is proposed. Inspired by the promising performance of U-Net, a convolutional network architecture originally proposed for the semantic segmentation of biomedical images, the proposed method uses image conversion by a U-Net...
Published in: | IEEE Access 2019-01, Vol. 7, p. 1-1 |
---|---|
Main authors: | Ohyama, Wataru; Suzuki, Masakazu; Uchida, Seiichi |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE access |
container_volume | 7 |
creator | Ohyama, Wataru; Suzuki, Masakazu; Uchida, Seiichi |
description | A detection method for mathematical expressions in scientific document images is proposed. Inspired by the promising performance of U-Net, a convolutional network architecture originally proposed for the semantic segmentation of biomedical images, the proposed method uses image conversion by a U-Net framework. The method does not use any mathematical or linguistic grammar information, so it can serve as a supplemental bypass in the conventional mathematical optical character recognition (OCR) pipeline. The evaluation experiments confirmed that (1) the proposed method detects mathematical symbols and expressions more accurately than InftyReader, a state-of-the-art software package for mathematical OCR; (2) the coverage of the training dataset over variations in document style is important; and (3) retraining with a small number of additional training samples is effective in improving performance. An additional contribution is the release of a dataset for benchmarking OCR on scientific documents. |
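The record contains no code, but the pipeline the abstract describes (a U-Net that converts a page image into a per-pixel map of mathematical-expression regions, used alongside a conventional OCR pipeline) can be sketched as follows. This is a minimal illustrative sketch, not the authors' released implementation: the network depth, channel widths, patch size, and binary cross-entropy loss are all hypothetical choices made for brevity.

```python
# Minimal sketch of a U-Net-style math-expression detector (PyTorch).
# NOT the paper's code; depth, channel widths, and loss are assumptions.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, the basic U-Net building block."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class SmallUNet(nn.Module):
    """Encoder-decoder with skip connections; maps a 1-channel page image
    to a 1-channel map whose high values indicate math-expression pixels."""

    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(1, 32)
        self.enc2 = conv_block(32, 64)
        self.bottleneck = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.dec2 = conv_block(128, 64)   # 64 (skip) + 64 (upsampled) channels in
        self.up1 = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec1 = conv_block(64, 32)    # 32 (skip) + 32 (upsampled) channels in
        self.head = nn.Conv2d(32, 1, kernel_size=1)  # per-pixel logits

    def forward(self, x):
        e1 = self.enc1(x)                   # full resolution
        e2 = self.enc2(self.pool(e1))       # 1/2 resolution
        b = self.bottleneck(self.pool(e2))  # 1/4 resolution
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                # logits, same spatial size as input


if __name__ == "__main__":
    # One hypothetical training step on a 256x256 page patch whose ground
    # truth marks math-expression pixels with 1 and everything else with 0.
    model = SmallUNet()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.BCEWithLogitsLoss()

    page_patch = torch.rand(1, 1, 256, 256)                  # stand-in document image
    math_mask = (torch.rand(1, 1, 256, 256) > 0.9).float()   # stand-in labels

    logits = model(page_patch)
    loss = criterion(logits, math_mask)
    loss.backward()
    optimizer.step()
    print("training-step loss:", loss.item())
```

In such a setup, the predicted map would be thresholded and grouped into expression bounding boxes before being handed to an OCR engine; because the network uses only pixel information, it fits the "supplemental bypass" role the abstract describes rather than replacing the grammar-aware OCR stages.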
doi_str_mv | 10.1109/ACCESS.2019.2945825 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2019-01, Vol.7, p.1-1 |
issn | 2169-3536 |
language | eng |
recordid | cdi_proquest_journals_2455597652 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Character recognition; Computer architecture; convolutional neural networks; Datasets; document image analysis; Image segmentation; Mathematical analysis; mathematical expression detection; Medical imaging; neural networks; object detection; Optical character recognition; Performance enhancement; Retraining; Training |
title | Detecting Mathematical Expressions in Scientific Document Images Using a U-Net Trained on a Diverse Dataset |