Document layout extraction

Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coo...
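
The four stages named in the abstract (receive text, convert it to a coordinate-bearing interface format, run a structure and layout analysis, store the result in an enriched, searchable format) can be illustrated with a minimal sketch. Nothing below comes from the patent itself: the `Element` schema, the fixed-pitch coordinate model, the "short line without a final period is a heading" heuristic, the JSON encoding, and every function name are assumptions made only for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Element:
    """One structural element with page coordinates (hypothetical schema)."""
    text: str
    x: float
    y: float
    width: float
    height: float
    role: str = "unknown"  # filled in by the analysis step


def to_interface_format(lines, line_height=12.0, char_width=6.0):
    """Convert raw text lines into coordinate-bearing elements.

    Assumes a fixed-pitch layout; blank lines occupy vertical space
    but produce no element.
    """
    elements = []
    for row, line in enumerate(lines):
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip())
        elements.append(Element(
            text=stripped,
            x=indent * char_width,
            y=row * line_height,
            width=len(stripped) * char_width,
            height=line_height,
        ))
    return elements


def analyze_layout(elements, max_heading_chars=40):
    """Toy heuristic (an assumption): a short line with no final period is a heading."""
    for el in elements:
        is_heading = len(el.text) <= max_heading_chars and not el.text.endswith(".")
        el.role = "heading" if is_heading else "paragraph"
    return elements


def to_enriched_format(elements):
    """Serialize text together with its structure and layout information as JSON."""
    return json.dumps({"elements": [asdict(el) for el in elements]})


def search(enriched, term):
    """Return (role, x, y) for every element whose text contains the term."""
    doc = json.loads(enriched)
    return [(el["role"], el["x"], el["y"])
            for el in doc["elements"]
            if term.lower() in el["text"].lower()]


raw = [
    "Document layout extraction",
    "",
    "  Textual data is received and converted to an interface format.",
]
enriched = to_enriched_format(analyze_layout(to_interface_format(raw)))
print(search(enriched, "interface"))
```

Keeping the coordinates alongside the text in the stored format is what lets a search hit be mapped back to a position in the document, which is the navigation use the abstract mentions.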

Detailed description

Bibliographic details
Main authors: Dresevic, Bodin; Trutner, Oren; Tomasevic, Sasa; Uzelac, Aleksandar; Lukacevic, Dejan
Format: Patent
Language: eng
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Dresevic, Bodin
Trutner, Oren
Tomasevic, Sasa
Uzelac, Aleksandar
Lukacevic, Dejan
description Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coordinates to one or more structural elements of the textual data. Further, in embodiments, a structure and layout analysis of the textual data is performed to generate a set of structure and layout information. Still further, in embodiments, the textual data and the set of structure and layout information is stored in an enriched interface format, the enriched interface format providing for search and navigation of the textual data.
format Patent
fullrecord <record><control><sourceid>uspatents_EFH</sourceid><recordid>TN_cdi_uspatents_grants_08250469</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>08250469</sourcerecordid><originalsourceid>FETCH-uspatents_grants_082504693</originalsourceid><addsrcrecordid>eNrjZJByyU8uzU3NK1HISazMLy1RSK0oKUpMLsnMz-NhYE1LzClO5YXS3AwKbq4hzh66pcUFiSVALcXx6UWJIMrAwsjUwMTM0pgIJQBU6CRR</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Document layout extraction</title><source>USPTO Issued Patents</source><creator>Dresevic, Bodin ; Trutner, Oren ; Tomasevic, Sasa ; Uzelac, Aleksandar ; Lukacevic, Dejan</creator><creatorcontrib>Dresevic, Bodin ; Trutner, Oren ; Tomasevic, Sasa ; Uzelac, Aleksandar ; Lukacevic, Dejan ; Microsoft Corporation</creatorcontrib><description>Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coordinates to one or more structural elements of the textual data. Further, in embodiments, a structure and layout analysis of the textual data is performed to generate a set of structure and layout information. 
Still further, in embodiments, the textual data and the set of structure and layout information is stored in an enriched interface format, the enriched interface format providing for search and navigation of the textual data.</description><language>eng</language><creationdate>2012</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/8250469$$EPDF$$P50$$Guspatents$$Hfree_for_read</linktopdf><link.rule.ids>230,308,776,798,881,64012</link.rule.ids><linktorsrc>$$Uhttps://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/8250469$$EView_record_in_USPTO$$FView_record_in_$$GUSPTO$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Dresevic, Bodin</creatorcontrib><creatorcontrib>Trutner, Oren</creatorcontrib><creatorcontrib>Tomasevic, Sasa</creatorcontrib><creatorcontrib>Uzelac, Aleksandar</creatorcontrib><creatorcontrib>Lukacevic, Dejan</creatorcontrib><creatorcontrib>Microsoft Corporation</creatorcontrib><title>Document layout extraction</title><description>Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coordinates to one or more structural elements of the textual data. Further, in embodiments, a structure and layout analysis of the textual data is performed to generate a set of structure and layout information. 
Still further, in embodiments, the textual data and the set of structure and layout information is stored in an enriched interface format, the enriched interface format providing for search and navigation of the textual data.</description><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2012</creationdate><recordtype>patent</recordtype><sourceid>EFH</sourceid><recordid>eNrjZJByyU8uzU3NK1HISazMLy1RSK0oKUpMLsnMz-NhYE1LzClO5YXS3AwKbq4hzh66pcUFiSVALcXx6UWJIMrAwsjUwMTM0pgIJQBU6CRR</recordid><startdate>20120821</startdate><enddate>20120821</enddate><creator>Dresevic, Bodin</creator><creator>Trutner, Oren</creator><creator>Tomasevic, Sasa</creator><creator>Uzelac, Aleksandar</creator><creator>Lukacevic, Dejan</creator><scope>EFH</scope></search><sort><creationdate>20120821</creationdate><title>Document layout extraction</title><author>Dresevic, Bodin ; Trutner, Oren ; Tomasevic, Sasa ; Uzelac, Aleksandar ; Lukacevic, Dejan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-uspatents_grants_082504693</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2012</creationdate><toplevel>online_resources</toplevel><creatorcontrib>Dresevic, Bodin</creatorcontrib><creatorcontrib>Trutner, Oren</creatorcontrib><creatorcontrib>Tomasevic, Sasa</creatorcontrib><creatorcontrib>Uzelac, Aleksandar</creatorcontrib><creatorcontrib>Lukacevic, Dejan</creatorcontrib><creatorcontrib>Microsoft Corporation</creatorcontrib><collection>USPTO Issued Patents</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Dresevic, Bodin</au><au>Trutner, Oren</au><au>Tomasevic, Sasa</au><au>Uzelac, Aleksandar</au><au>Lukacevic, Dejan</au><aucorp>Microsoft Corporation</aucorp><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Document layout 
extraction</title><date>2012-08-21</date><risdate>2012</risdate><abstract>Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coordinates to one or more structural elements of the textual data. Further, in embodiments, a structure and layout analysis of the textual data is performed to generate a set of structure and layout information. Still further, in embodiments, the textual data and the set of structure and layout information is stored in an enriched interface format, the enriched interface format providing for search and navigation of the textual data.</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier
ispartof
issn
language eng
recordid cdi_uspatents_grants_08250469
source USPTO Issued Patents
title Document layout extraction
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T20%3A21%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-uspatents_EFH&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Dresevic,%20Bodin&rft.aucorp=Microsoft%20Corporation&rft.date=2012-08-21&rft_id=info:doi/&rft_dat=%3Cuspatents_EFH%3E08250469%3C/uspatents_EFH%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true