A Restoration and Segmentation Unit for the Historic Persian Documents

This paper presents a document restoration and segmentation algorithm for historic Middle Persian (Pahlavi) manuscripts. The proposed algorithm uses mathematical morphology and connected-component analysis to segment overlapping lines, words, and characters in the Middle-age Persian d...
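
The abstract names connected components and text line extraction as building blocks. The following is a minimal, generic sketch of those two ideas in Python, not the authors' algorithm, whose details this record does not give. The function names, the tiny `PAGE` bitmap, and the choices of 8-connectivity and of row-wise ink detection for line bands are all assumptions made for illustration.

```python
from collections import deque


def text_line_bands(image):
    """Return (top, bottom) row ranges of consecutive rows that contain ink.

    `image` is a list of rows; 1 marks ink and 0 marks background.
    Assumption: lines are separated by at least one empty row, which
    overlapping historic script often violates.
    """
    bands, start = [], None
    for y, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            bands.append((start, y - 1))
            start = None
    if start is not None:
        bands.append((start, len(image) - 1))
    return bands


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of 8-connected ink regions."""
    height = len(image)
    width = len(image[0]) if image else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not image[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from the first unseen ink pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Hypothetical 5x6 binary page: two strokes on one line, one stroke on the next.
PAGE = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
]

print(text_line_bands(PAGE))
print(connected_components(PAGE))
```

On a real scan, the empty-row assumption breaks down wherever strokes from neighbouring lines touch, which is the overlap case the abstract says the paper addresses.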

Bibliographic details
Main authors: Alirezaee, Shahpour; Fard, Alireza Shayesteh; Aghaeinia, Hassan; Faez, Karim
Format: Conference proceeding
Language: eng
Online access: Full text
container_end_page 680
container_issue
container_start_page 674
container_title
container_volume
creator Alirezaee, Shahpour
Fard, Alireza Shayesteh
Aghaeinia, Hassan
Faez, Karim
description This paper presents a document restoration and segmentation algorithm for historic Middle Persian (Pahlavi) manuscripts. The proposed algorithm uses mathematical morphology and connected-component analysis to segment overlapping lines, words, and characters in the Middle-age Persian documents in preparation for OCR. To evaluate the restoration algorithm, 200 pages of Pahlavi documents were used as test data. Numerical results indicate that the proposed algorithm can remove the noise and destructive effects. The results also show 99.14% accuracy on baseline detection, 97.35% accuracy on text line extraction and removal of overlaps from other lines, and 99.5% accuracy for segmenting the extracted text lines into their components.
doi_str_mv 10.1007/11558484_85
format Conference Proceeding
fullrecord <record><control><sourceid>pascalfrancis_sprin</sourceid><recordid>TN_cdi_pascalfrancis_primary_17324917</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>17324917</sourcerecordid><originalsourceid>FETCH-LOGICAL-p219t-86b457b3150407da80ab76b8fd5d941f0fd01594204a4b3fd2ba28be2494a41f3</originalsourceid><addsrcrecordid>eNpNkE1LAzEQhuMXWGpP_oFcPHhYzeRjkxxLtVYoKGrPIdlN6mqbLcl68N-bUhHnMAPv-zDDvAhdArkBQuQtgBCKK26UOEITLRUTnDBKeM2O0QhqgIoxrk_-PKqLTU_RiDBCKy05O0eTnD9IKQZCghih-RS_-Dz0yQ5dH7GNLX71662Pw0FYxW7AoU94ePd40e3JrsHPPuXORnzXN197Nl-gs2A32U9-5xit5vdvs0W1fHp4nE2X1Y6CHipVOy6kK8cJJ7K1ilgna6dCK1rNIZDQEhCal6csdyy01FmqnKdcFwECG6Orw96dzY3dhGRj02WzS93Wpm8DkhW09DG6PnC5WHHtk3F9_5kNELPP0vzLkv0An_1foA</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>A Restoration and Segmentation Unit for the Historic Persian Documents</title><source>Springer Books</source><creator>Alirezaee, Shahpour ; Fard, Alireza Shayesteh ; Aghaeinia, Hassan ; Faez, Karim</creator><contributor>Blanc-Talon, Jacques ; Popescu, Dan ; Philips, Wilfried ; Scheunders, Paul</contributor><creatorcontrib>Alirezaee, Shahpour ; Fard, Alireza Shayesteh ; Aghaeinia, Hassan ; Faez, Karim ; Blanc-Talon, Jacques ; Popescu, Dan ; Philips, Wilfried ; Scheunders, Paul</creatorcontrib><description>This paper aims to provide a document restoration and segmentation algorithm for the Historic Middle Persian or Pahlavi manuscripts. The proposed algorithm uses the mathematical morphology and connected component concept to segment the line, word, and character overlapped in the Middle-age Persian documents in preparation for OCR application. To evaluate the performance of the restoration algorithm, 200 pages of the Pahlavi documents are used as experimental data in our test. 
Numerical results indicate that the proposed algorithm can remove the noise and destructive effects. The results also show 99.14% accuracy on the baseline detection, 97.35% accuracy on the text line extraction and removing other lines overlaps, and 99.5% accuracy for segmenting the extracted text lines to their components.</description><edition>1ère éd</edition><identifier>ISSN: 0302-9743</identifier><identifier>ISBN: 9783540290322</identifier><identifier>ISBN: 354029032X</identifier><identifier>EISSN: 1611-3349</identifier><identifier>EISBN: 9783540320463</identifier><identifier>EISBN: 3540320466</identifier><identifier>DOI: 10.1007/11558484_85</identifier><language>eng</language><publisher>Berlin, Heidelberg: Springer Berlin Heidelberg</publisher><subject>Applied sciences ; Artificial intelligence ; Computer science; control theory; systems ; Electrical Engineer Department ; Exact sciences and technology ; Noise Removal ; Pattern recognition. Digital image processing. Computational geometry ; Renyi Entropy ; Speech and sound recognition and synthesis. 
Linguistics ; Text Line ; Word Segmentation</subject><ispartof>Advanced Concepts for Intelligent Vision Systems, 2005, p.674-680</ispartof><rights>Springer-Verlag Berlin Heidelberg 2005</rights><rights>2006 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/11558484_85$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/11558484_85$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>309,310,779,780,784,789,790,793,4050,4051,27925,38255,41442,42511</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=17324917$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><contributor>Blanc-Talon, Jacques</contributor><contributor>Popescu, Dan</contributor><contributor>Philips, Wilfried</contributor><contributor>Scheunders, Paul</contributor><creatorcontrib>Alirezaee, Shahpour</creatorcontrib><creatorcontrib>Fard, Alireza Shayesteh</creatorcontrib><creatorcontrib>Aghaeinia, Hassan</creatorcontrib><creatorcontrib>Faez, Karim</creatorcontrib><title>A Restoration and Segmentation Unit for the Historic Persian Documents</title><title>Advanced Concepts for Intelligent Vision Systems</title><description>This paper aims to provide a document restoration and segmentation algorithm for the Historic Middle Persian or Pahlavi manuscripts. The proposed algorithm uses the mathematical morphology and connected component concept to segment the line, word, and character overlapped in the Middle-age Persian documents in preparation for OCR application. To evaluate the performance of the restoration algorithm, 200 pages of the Pahlavi documents are used as experimental data in our test. 
Numerical results indicate that the proposed algorithm can remove the noise and destructive effects. The results also show 99.14% accuracy on the baseline detection, 97.35% accuracy on the text line extraction and removing other lines overlaps, and 99.5% accuracy for segmenting the extracted text lines to their components.</description><subject>Applied sciences</subject><subject>Artificial intelligence</subject><subject>Computer science; control theory; systems</subject><subject>Electrical Engineer Department</subject><subject>Exact sciences and technology</subject><subject>Noise Removal</subject><subject>Pattern recognition. Digital image processing. Computational geometry</subject><subject>Renyi Entropy</subject><subject>Speech and sound recognition and synthesis. Linguistics</subject><subject>Text Line</subject><subject>Word Segmentation</subject><issn>0302-9743</issn><issn>1611-3349</issn><isbn>9783540290322</isbn><isbn>354029032X</isbn><isbn>9783540320463</isbn><isbn>3540320466</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2005</creationdate><recordtype>conference_proceeding</recordtype><recordid>eNpNkE1LAzEQhuMXWGpP_oFcPHhYzeRjkxxLtVYoKGrPIdlN6mqbLcl68N-bUhHnMAPv-zDDvAhdArkBQuQtgBCKK26UOEITLRUTnDBKeM2O0QhqgIoxrk_-PKqLTU_RiDBCKy05O0eTnD9IKQZCghih-RS_-Dz0yQ5dH7GNLX71662Pw0FYxW7AoU94ePd40e3JrsHPPuXORnzXN197Nl-gs2A32U9-5xit5vdvs0W1fHp4nE2X1Y6CHipVOy6kK8cJJ7K1ilgna6dCK1rNIZDQEhCal6csdyy01FmqnKdcFwECG6Orw96dzY3dhGRj02WzS93Wpm8DkhW09DG6PnC5WHHtk3F9_5kNELPP0vzLkv0An_1foA</recordid><startdate>2005</startdate><enddate>2005</enddate><creator>Alirezaee, Shahpour</creator><creator>Fard, Alireza Shayesteh</creator><creator>Aghaeinia, Hassan</creator><creator>Faez, Karim</creator><general>Springer Berlin Heidelberg</general><general>Springer</general><scope>IQODW</scope></search><sort><creationdate>2005</creationdate><title>A Restoration and Segmentation Unit for the Historic Persian Documents</title><author>Alirezaee, Shahpour ; 
Fard, Alireza Shayesteh ; Aghaeinia, Hassan ; Faez, Karim</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p219t-86b457b3150407da80ab76b8fd5d941f0fd01594204a4b3fd2ba28be2494a41f3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2005</creationdate><topic>Applied sciences</topic><topic>Artificial intelligence</topic><topic>Computer science; control theory; systems</topic><topic>Electrical Engineer Department</topic><topic>Exact sciences and technology</topic><topic>Noise Removal</topic><topic>Pattern recognition. Digital image processing. Computational geometry</topic><topic>Renyi Entropy</topic><topic>Speech and sound recognition and synthesis. Linguistics</topic><topic>Text Line</topic><topic>Word Segmentation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Alirezaee, Shahpour</creatorcontrib><creatorcontrib>Fard, Alireza Shayesteh</creatorcontrib><creatorcontrib>Aghaeinia, Hassan</creatorcontrib><creatorcontrib>Faez, Karim</creatorcontrib><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Alirezaee, Shahpour</au><au>Fard, Alireza Shayesteh</au><au>Aghaeinia, Hassan</au><au>Faez, Karim</au><au>Blanc-Talon, Jacques</au><au>Popescu, Dan</au><au>Philips, Wilfried</au><au>Scheunders, Paul</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>A Restoration and Segmentation Unit for the Historic Persian Documents</atitle><btitle>Advanced Concepts for Intelligent Vision Systems</btitle><date>2005</date><risdate>2005</risdate><spage>674</spage><epage>680</epage><pages>674-680</pages><issn>0302-9743</issn><eissn>1611-3349</eissn><isbn>9783540290322</isbn><isbn>354029032X</isbn><eisbn>9783540320463</eisbn><eisbn>3540320466</eisbn><abstract>This paper aims to provide a document 
restoration and segmentation algorithm for the Historic Middle Persian or Pahlavi manuscripts. The proposed algorithm uses the mathematical morphology and connected component concept to segment the line, word, and character overlapped in the Middle-age Persian documents in preparation for OCR application. To evaluate the performance of the restoration algorithm, 200 pages of the Pahlavi documents are used as experimental data in our test. Numerical results indicate that the proposed algorithm can remove the noise and destructive effects. The results also show 99.14% accuracy on the baseline detection, 97.35% accuracy on the text line extraction and removing other lines overlaps, and 99.5% accuracy for segmenting the extracted text lines to their components.</abstract><cop>Berlin, Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/11558484_85</doi><tpages>7</tpages><edition>1ère éd</edition></addata></record>
fulltext fulltext
identifier ISSN: 0302-9743
ispartof Advanced Concepts for Intelligent Vision Systems, 2005, p.674-680
issn 0302-9743
1611-3349
language eng
recordid cdi_pascalfrancis_primary_17324917
source Springer Books
subjects Applied sciences
Artificial intelligence
Computer science
control theory
systems
Electrical Engineer Department
Exact sciences and technology
Noise Removal
Pattern recognition. Digital image processing. Computational geometry
Renyi Entropy
Speech and sound recognition and synthesis. Linguistics
Text Line
Word Segmentation
title A Restoration and Segmentation Unit for the Historic Persian Documents
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T18%3A01%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_sprin&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=A%20Restoration%20and%20Segmentation%20Unit%20for%20the%20Historic%20Persian%20Documents&rft.btitle=Advanced%20Concepts%20for%20Intelligent%20Vision%20Systems&rft.au=Alirezaee,%20Shahpour&rft.date=2005&rft.spage=674&rft.epage=680&rft.pages=674-680&rft.issn=0302-9743&rft.eissn=1611-3349&rft.isbn=9783540290322&rft.isbn_list=354029032X&rft_id=info:doi/10.1007/11558484_85&rft_dat=%3Cpascalfrancis_sprin%3E17324917%3C/pascalfrancis_sprin%3E%3Curl%3E%3C/url%3E&rft.eisbn=9783540320463&rft.eisbn_list=3540320466&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true