Document layout extraction
Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coo...
Main authors: | Dresevic, Bodin; Trutner, Oren; Tomasevic, Sasa; Uzelac, Aleksandar; Lukacevic, Dejan |
---|---|
Format: | Patent |
Language: | eng |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Dresevic, Bodin Trutner, Oren Tomasevic, Sasa Uzelac, Aleksandar Lukacevic, Dejan |
description | Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coordinates to one or more structural elements of the textual data. Further, in embodiments, a structure and layout analysis of the textual data is performed to generate a set of structure and layout information. Still further, in embodiments, the textual data and the set of structure and layout information is stored in an enriched interface format, the enriched interface format providing for search and navigation of the textual data. |
format | Patent |
fullrecord | <record><control><sourceid>uspatents_EFH</sourceid><recordid>TN_cdi_uspatents_grants_08250469</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>08250469</sourcerecordid><originalsourceid>FETCH-uspatents_grants_082504693</originalsourceid><addsrcrecordid>eNrjZJByyU8uzU3NK1HISazMLy1RSK0oKUpMLsnMz-NhYE1LzClO5YXS3AwKbq4hzh66pcUFiSVALcXx6UWJIMrAwsjUwMTM0pgIJQBU6CRR</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Document layout extraction</title><source>USPTO Issued Patents</source><creator>Dresevic, Bodin ; Trutner, Oren ; Tomasevic, Sasa ; Uzelac, Aleksandar ; Lukacevic, Dejan</creator><creatorcontrib>Dresevic, Bodin ; Trutner, Oren ; Tomasevic, Sasa ; Uzelac, Aleksandar ; Lukacevic, Dejan ; Microsoft Corporation</creatorcontrib><description>Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coordinates to one or more structural elements of the textual data. Further, in embodiments, a structure and layout analysis of the textual data is performed to generate a set of structure and layout information. 
Still further, in embodiments, the textual data and the set of structure and layout information is stored in an enriched interface format, the enriched interface format providing for search and navigation of the textual data.</description><language>eng</language><creationdate>2012</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/8250469$$EPDF$$P50$$Guspatents$$Hfree_for_read</linktopdf><link.rule.ids>230,308,776,798,881,64012</link.rule.ids><linktorsrc>$$Uhttps://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/8250469$$EView_record_in_USPTO$$FView_record_in_$$GUSPTO$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Dresevic, Bodin</creatorcontrib><creatorcontrib>Trutner, Oren</creatorcontrib><creatorcontrib>Tomasevic, Sasa</creatorcontrib><creatorcontrib>Uzelac, Aleksandar</creatorcontrib><creatorcontrib>Lukacevic, Dejan</creatorcontrib><creatorcontrib>Microsoft Corporation</creatorcontrib><title>Document layout extraction</title><description>Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coordinates to one or more structural elements of the textual data. Further, in embodiments, a structure and layout analysis of the textual data is performed to generate a set of structure and layout information. 
Still further, in embodiments, the textual data and the set of structure and layout information is stored in an enriched interface format, the enriched interface format providing for search and navigation of the textual data.</description><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2012</creationdate><recordtype>patent</recordtype><sourceid>EFH</sourceid><recordid>eNrjZJByyU8uzU3NK1HISazMLy1RSK0oKUpMLsnMz-NhYE1LzClO5YXS3AwKbq4hzh66pcUFiSVALcXx6UWJIMrAwsjUwMTM0pgIJQBU6CRR</recordid><startdate>20120821</startdate><enddate>20120821</enddate><creator>Dresevic, Bodin</creator><creator>Trutner, Oren</creator><creator>Tomasevic, Sasa</creator><creator>Uzelac, Aleksandar</creator><creator>Lukacevic, Dejan</creator><scope>EFH</scope></search><sort><creationdate>20120821</creationdate><title>Document layout extraction</title><author>Dresevic, Bodin ; Trutner, Oren ; Tomasevic, Sasa ; Uzelac, Aleksandar ; Lukacevic, Dejan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-uspatents_grants_082504693</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2012</creationdate><toplevel>online_resources</toplevel><creatorcontrib>Dresevic, Bodin</creatorcontrib><creatorcontrib>Trutner, Oren</creatorcontrib><creatorcontrib>Tomasevic, Sasa</creatorcontrib><creatorcontrib>Uzelac, Aleksandar</creatorcontrib><creatorcontrib>Lukacevic, Dejan</creatorcontrib><creatorcontrib>Microsoft Corporation</creatorcontrib><collection>USPTO Issued Patents</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Dresevic, Bodin</au><au>Trutner, Oren</au><au>Tomasevic, Sasa</au><au>Uzelac, Aleksandar</au><au>Lukacevic, Dejan</au><aucorp>Microsoft Corporation</aucorp><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Document layout 
extraction</title><date>2012-08-21</date><risdate>2012</risdate><abstract>Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coordinates to one or more structural elements of the textual data. Further, in embodiments, a structure and layout analysis of the textual data is performed to generate a set of structure and layout information. Still further, in embodiments, the textual data and the set of structure and layout information is stored in an enriched interface format, the enriched interface format providing for search and navigation of the textual data.</abstract><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | eng |
recordid | cdi_uspatents_grants_08250469 |
source | USPTO Issued Patents |
title | Document layout extraction |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T20%3A21%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-uspatents_EFH&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Dresevic,%20Bodin&rft.aucorp=Microsoft%20Corporation&rft.date=2012-08-21&rft_id=info:doi/&rft_dat=%3Cuspatents_EFH%3E08250469%3C/uspatents_EFH%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
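
The description field above outlines three steps: convert the received textual data into a format-independent representation that carries coordinates, run a structure and layout analysis, and store the result in an enriched form that supports search and navigation. The following is a minimal sketch of that flow, not the patented method. The `Element` class, the function names, the input dictionaries, and the font-size heuristic for spotting headings are all assumptions made for illustration; the record does not specify any of them.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """One structural element in a hypothetical interface format."""
    text: str
    page: int
    bbox: tuple  # (x0, y0, x1, y1); units are an assumption
    font_size: float
    role: str = "unknown"  # filled in by the layout analysis step


def to_interface_format(raw_items):
    """Step 1: convert source-specific items into format-independent elements."""
    return [
        Element(text=i["text"], page=i["page"], bbox=tuple(i["bbox"]), font_size=i["size"])
        for i in raw_items
    ]


def analyze_layout(elements, heading_ratio=1.2):
    """Step 2: toy structure analysis (assumed heuristic, not from the source).

    An element whose font is noticeably larger than the median size is
    labeled a heading; everything else is labeled a paragraph.
    """
    sizes = sorted(e.font_size for e in elements)
    median = sizes[len(sizes) // 2]
    for e in elements:
        e.role = "heading" if e.font_size >= heading_ratio * median else "paragraph"
    return elements


def to_enriched_format(elements):
    """Step 3: keep text and structure together for search and navigation."""
    return {
        "elements": elements,
        "outline": [e for e in elements if e.role == "heading"],
    }


def search(enriched, term):
    """Return (page, bbox) for every element whose text contains the term."""
    term = term.lower()
    return [(e.page, e.bbox) for e in enriched["elements"] if term in e.text.lower()]


# Hypothetical input, as it might come out of a source-format reader.
raw = [
    {"text": "Document layout extraction", "page": 1, "bbox": [72, 700, 400, 730], "size": 18.0},
    {"text": "Textual data is received and converted.", "page": 1, "bbox": [72, 650, 540, 690], "size": 10.0},
    {"text": "A structure and layout analysis follows.", "page": 1, "bbox": [72, 600, 540, 640], "size": 10.0},
]
enriched = to_enriched_format(analyze_layout(to_interface_format(raw)))
```

Because each element keeps its page and bounding box, a search hit can be mapped back to a position in the original document, and the `outline` list gives a simple navigation structure.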