DOCUMENT PAGE SEGMENTATION IN OPTICAL CHARACTER RECOGNITION

Page segmentation in an optical character recognition process is performed to detect textual objects and/or image objects. Textual objects in an input gray scale image are detected by selecting candidates for native lines which are sets of horizontally neighboring connected components (i.e., subsets...

Bibliographic details
Main authors: GALIC, SASA ; RADAKOVIC, BOGDAN ; TODIC, NIKOLA
Format: Patent
Language: eng ; fre
container_end_page
container_issue
container_start_page
container_title
container_volume
creator GALIC, SASA
RADAKOVIC, BOGDAN
TODIC, NIKOLA
description Page segmentation in an optical character recognition process is performed to detect textual objects and/or image objects. Textual objects in an input gray scale image are detected by selecting candidates for native lines which are sets of horizontally neighboring connected components (i.e., subsets of image pixels where each pixel from the set is connected with all remaining pixels from the set) having similar vertical statistics defined by values of baseline (the line upon which most text characters "sit") and mean line (the line under which most of the characters "hang"). Binary classification is performed on the native line candidates to classify them as textual or non-textual through examination of any embedded regularity. Image objects are indirectly detected by detecting the image's background using the detected text to define the background. Once the background is detected, what remains (i.e., the non-background) is an image object.
format Patent
fullrecord <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_CA2789813A1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CA2789813A1</sourcerecordid><originalsourceid>FETCH-epo_espacenet_CA2789813A13</originalsourceid><addsrcrecordid>eNrjZLB28XcO9XX1C1EIcHR3VQh2dQdxHEM8_f0UPP0U_ANCPJ0dfRScPRyDHJ1DXIMUglyd_d39PEEKeBhY0xJzilN5oTQ3g4Kba4izh25qQX58anFBYnJqXmpJvLOjkbmFpYWhsaOhMRFKAJPdKO8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>DOCUMENT PAGE SEGMENTATION IN OPTICAL CHARACTER RECOGNITION</title><source>esp@cenet</source><creator>GALIC, SASA ; RADAKOVIC, BOGDAN ; TODIC, NIKOLA</creator><creatorcontrib>GALIC, SASA ; RADAKOVIC, BOGDAN ; TODIC, NIKOLA</creatorcontrib><description>Page segmentation in an optical character recognition process is performed to detect textual objects and/or image objects. Textual objects in an input gray scale image are detected by selecting candidates for native lines which are sets of horizontally neighboring connected components (i.e., subsets of image pixels where each pixel from the set is connected with all remaining pixels from the set) having similar vertical statistics defined by values of baseline (the line upon which most text characters "sit") and mean line (the line under which most of the characters "hang"). Binary classification is performed on the native line candidates to classify them as textual or non-textual through examination of any embedded regularity. Image objects are indirectly detected by detecting the image's background using the detected text to define the background. 
Once the background is detected, what remains (i.e., the non-background) is an image object.</description><language>eng ; fre</language><subject>CALCULATING ; COMPUTING ; COUNTING ; HANDLING RECORD CARRIERS ; PHYSICS ; PRESENTATION OF DATA ; RECOGNITION OF DATA ; RECORD CARRIERS</subject><creationdate>2011</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20110915&amp;DB=EPODOC&amp;CC=CA&amp;NR=2789813A1$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,777,882,25545,76296</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20110915&amp;DB=EPODOC&amp;CC=CA&amp;NR=2789813A1$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>GALIC, SASA</creatorcontrib><creatorcontrib>RADAKOVIC, BOGDAN</creatorcontrib><creatorcontrib>TODIC, NIKOLA</creatorcontrib><title>DOCUMENT PAGE SEGMENTATION IN OPTICAL CHARACTER RECOGNITION</title><description>Page segmentation in an optical character recognition process is performed to detect textual objects and/or image objects. Textual objects in an input gray scale image are detected by selecting candidates for native lines which are sets of horizontally neighboring connected components (i.e., subsets of image pixels where each pixel from the set is connected with all remaining pixels from the set) having similar vertical statistics defined by values of baseline (the line upon which most text characters "sit") and mean line (the line under which most of the characters "hang"). 
Binary classification is performed on the native line candidates to classify them as textual or non-textual through examination of any embedded regularity. Image objects are indirectly detected by detecting the image's background using the detected text to define the background. Once the background is detected, what remains (i.e., the non-background) is an image object.</description><subject>CALCULATING</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>HANDLING RECORD CARRIERS</subject><subject>PHYSICS</subject><subject>PRESENTATION OF DATA</subject><subject>RECOGNITION OF DATA</subject><subject>RECORD CARRIERS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2011</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZLB28XcO9XX1C1EIcHR3VQh2dQdxHEM8_f0UPP0U_ANCPJ0dfRScPRyDHJ1DXIMUglyd_d39PEEKeBhY0xJzilN5oTQ3g4Kba4izh25qQX58anFBYnJqXmpJvLOjkbmFpYWhsaOhMRFKAJPdKO8</recordid><startdate>20110915</startdate><enddate>20110915</enddate><creator>GALIC, SASA</creator><creator>RADAKOVIC, BOGDAN</creator><creator>TODIC, NIKOLA</creator><scope>EVB</scope></search><sort><creationdate>20110915</creationdate><title>DOCUMENT PAGE SEGMENTATION IN OPTICAL CHARACTER RECOGNITION</title><author>GALIC, SASA ; RADAKOVIC, BOGDAN ; TODIC, NIKOLA</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_CA2789813A13</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng ; fre</language><creationdate>2011</creationdate><topic>CALCULATING</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>HANDLING RECORD CARRIERS</topic><topic>PHYSICS</topic><topic>PRESENTATION OF DATA</topic><topic>RECOGNITION OF DATA</topic><topic>RECORD CARRIERS</topic><toplevel>online_resources</toplevel><creatorcontrib>GALIC, SASA</creatorcontrib><creatorcontrib>RADAKOVIC, BOGDAN</creatorcontrib><creatorcontrib>TODIC, 
NIKOLA</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>GALIC, SASA</au><au>RADAKOVIC, BOGDAN</au><au>TODIC, NIKOLA</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>DOCUMENT PAGE SEGMENTATION IN OPTICAL CHARACTER RECOGNITION</title><date>2011-09-15</date><risdate>2011</risdate><abstract>Page segmentation in an optical character recognition process is performed to detect textual objects and/or image objects. Textual objects in an input gray scale image are detected by selecting candidates for native lines which are sets of horizontally neighboring connected components (i.e., subsets of image pixels where each pixel from the set is connected with all remaining pixels from the set) having similar vertical statistics defined by values of baseline (the line upon which most text characters "sit") and mean line (the line under which most of the characters "hang"). Binary classification is performed on the native line candidates to classify them as textual or non-textual through examination of any embedded regularity. Image objects are indirectly detected by detecting the image's background using the detected text to define the background. Once the background is detected, what remains (i.e., the non-background) is an image object.</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier
ispartof
issn
language eng ; fre
recordid cdi_epo_espacenet_CA2789813A1
source esp@cenet
subjects CALCULATING
COMPUTING
COUNTING
HANDLING RECORD CARRIERS
PHYSICS
PRESENTATION OF DATA
RECOGNITION OF DATA
RECORD CARRIERS
title DOCUMENT PAGE SEGMENTATION IN OPTICAL CHARACTER RECOGNITION
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T11%3A13%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=GALIC,%20SASA&rft.date=2011-09-15&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECA2789813A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
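
The description field above outlines how native line candidates are formed: horizontally neighboring connected components are grouped when their baseline and mean line values are similar. The following is a minimal sketch of that grouping step only, not the method claimed in the patent. It assumes the connected components have already been extracted and that a baseline and mean line estimate is available for each one. The names Component and group_native_line_candidates, and the parameters max_gap and tol, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    """A connected component reduced to the values used for grouping.

    Image coordinates are assumed, so y grows downward and
    mean_line < baseline for ordinary text.
    """
    left: int
    right: int
    mean_line: int  # assumed per-component estimate of the mean line (y)
    baseline: int   # assumed per-component estimate of the baseline (y)


def group_native_line_candidates(
    components: List[Component],
    max_gap: int = 20,  # hypothetical: largest horizontal gap between neighbors
    tol: int = 3,       # hypothetical: allowed baseline / mean line difference
) -> List[List[Component]]:
    """Greedily group horizontally neighboring components with similar
    vertical statistics into native line candidates."""
    lines: List[List[Component]] = []
    for comp in sorted(components, key=lambda c: c.left):
        for line in lines:
            last = line[-1]
            close_enough = comp.left - last.right <= max_gap
            same_baseline = abs(comp.baseline - last.baseline) <= tol
            same_mean_line = abs(comp.mean_line - last.mean_line) <= tol
            if close_enough and same_baseline and same_mean_line:
                line.append(comp)
                break
        else:
            # No existing candidate accepts this component: start a new one.
            lines.append([comp])
    return lines


# Usage: three small glyph-like components on one line, one large
# component with very different statistics, and one distant component.
components = [
    Component(left=0, right=8, mean_line=10, baseline=20),
    Component(left=10, right=18, mean_line=10, baseline=21),
    Component(left=20, right=28, mean_line=11, baseline=20),
    Component(left=12, right=40, mean_line=50, baseline=90),
    Component(left=200, right=208, mean_line=10, baseline=20),
]
candidates = group_native_line_candidates(components)
```

In this sketch the first three components fall into one candidate, while the large component and the distant one each start their own. The later steps named in the description (binary classification of candidates as textual or non-textual, and background detection) are not covered here.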