ELECTRONIC TABLE OF CONTENTS ENTRY CLASSIFICATION AND LABELING SCHEME
Computer-storage media, computerized methods and systems for classifying character strings within electronic documents are provided. Initially, textual data, which includes one or more character strings, is extracted from an electronic version of a document, typically scanned from a physical documen...
Main authors: | UZELAC ALEKSANDAR, TRUTNER OREN, DRESEVIC BODIN, RADAKOVIC BOGDAN, GALIC SASA, LUKACEVIC DEJAN |
---|---|
Format: | Patent |
Language: | eng |
Keywords: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | UZELAC ALEKSANDAR ; TRUTNER OREN ; DRESEVIC BODIN ; RADAKOVIC BOGDAN ; GALIC SASA ; LUKACEVIC DEJAN |
description | Computer-storage media, computerized methods and systems for classifying character strings within electronic documents are provided. Initially, textual data, which includes one or more character strings, is extracted from an electronic version of a document, typically scanned from a physical document utilizing optical character recognition. The textual data is received at a table-of-contents (TOC) engine that extracts semantic information from the textual data. Sub-engines within the TOC engine analyze the semantic information to determine at least one appropriate classification for character strings within the textual data. Labels selected from a predetermined set of TOC-architecture labels are appended to the character strings according to the appropriate classification. The character strings, and labels appended thereto, are stored in association with each other generating an electronic document file that includes enriched textual data. |
format | Patent |
date | 2009-06-04 |
oa | free_for_read |
linktohtml | https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20090604&DB=EPODOC&CC=US&NR=2009144277A1 |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | eng |
recordid | cdi_epo_espacenet_US2009144277A1 |
source | esp@cenet |
subjects | CALCULATING ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS |
title | ELECTRONIC TABLE OF CONTENTS ENTRY CLASSIFICATION AND LABELING SCHEME |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T20%3A22%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=UZELAC%20ALEKSANDAR&rft.date=2009-06-04&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS2009144277A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
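
The flow summarized in the description field (classify each character string, append a label from a predetermined set, store strings and labels together) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record does not name the labels or describe how the sub-engines decide, so the label names (`entry_title`, `leader`, `page_number`), the regular expression, and the function `label_entry` are hypothetical stand-ins, not the patented method.

```python
import json
import re
from dataclasses import asdict, dataclass

# Hypothetical label set. The abstract only says labels come from a
# "predetermined set of TOC-architecture labels" and does not list them.
TOC_LABELS = ("entry_title", "leader", "page_number")

# Toy pattern for one table-of-contents line:
# title text, optional dot leader, optional trailing page number.
ENTRY = re.compile(r"^(?P<title>.+?)\s*(?P<leader>\.{2,})?\s*(?P<page>\d+)?\s*$")


@dataclass
class LabeledString:
    text: str   # the character string taken from the line
    label: str  # one of TOC_LABELS


def label_entry(line: str) -> list[LabeledString]:
    """Split one TOC line into character strings and attach a label to each."""
    match = ENTRY.match(line.strip())
    if match is None:  # blank line: nothing to classify
        return []
    labeled = [LabeledString(match["title"], "entry_title")]
    if match["leader"]:
        labeled.append(LabeledString(match["leader"], "leader"))
    if match["page"]:
        labeled.append(LabeledString(match["page"], "page_number"))
    return labeled


if __name__ == "__main__":
    # Store strings and labels together, loosely mirroring the
    # "enriched textual data" mentioned in the abstract.
    enriched = [asdict(item) for item in label_entry("Chapter 1 Introduction ..... 3")]
    print(json.dumps(enriched, indent=2))
```

A pattern-based classifier is used here only because it is short and easy to follow; the abstract speaks of semantic information analyzed by several sub-engines, which this sketch does not attempt to reproduce.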