Residual context refinement network architecture for optical character recognition

Techniques for recognizing text in an image are described. An exemplary method may include receiving a request to recognize text in an image; extracting features from the image and generating a visual feature sequence from the extracted features; performing selective contextual refinement using at least one selective contextual refinement block of a stack of selective contextual refinement blocks to generate a text prediction by: generating a contextual feature map and combining the contextual feature map with the visual feature sequence into a visual feature space, and applying a selective decoder that utilizes a two-step attention on the visual feature space to generate the text prediction, wherein the two-step attention includes performing a 1-D self-attention computation to generate attentional features and decoding the attentional features to generate the text prediction; and outputting the generated text prediction.
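
The pipeline named in the abstract (visual feature sequence -> contextual feature map -> combined visual feature space -> two-step attention decoder) can be illustrated as a neural module. Below is a minimal PyTorch sketch, not the patented implementation: the abstract does not disclose layer types, dimensions, or names, so the BiLSTM context model, the multi-head attention layers, the learned positional queries, and all parameter values here are assumptions made for illustration only.

    import torch
    import torch.nn as nn

    class SelectiveContextualRefinementBlock(nn.Module):
        """One refinement block: contextual feature map + two-step attention.

        All names, layer choices, and dimensions are illustrative guesses;
        the patent abstract does not specify them.
        """

        def __init__(self, dim=512, num_heads=8, vocab_size=97, max_len=32):
            super().__init__()
            # Contextual feature map over the visual feature sequence
            # (a BiLSTM is one plausible choice; the patent does not say).
            self.context = nn.LSTM(dim, dim // 2, bidirectional=True,
                                   batch_first=True)
            # Step 1 of the two-step attention: 1-D self-attention over
            # the combined visual feature space.
            self.self_attn = nn.MultiheadAttention(dim, num_heads,
                                                   batch_first=True)
            # Step 2: decode the attentional features into per-position
            # character predictions via learned positional queries.
            self.pos_queries = nn.Parameter(torch.randn(1, max_len, dim))
            self.cross_attn = nn.MultiheadAttention(dim, num_heads,
                                                    batch_first=True)
            self.classifier = nn.Linear(dim, vocab_size)

        def forward(self, visual_seq):
            # visual_seq: (batch, seq_len, dim) visual feature sequence
            # extracted from the image by a backbone CNN (not shown).
            ctx, _ = self.context(visual_seq)  # contextual feature map
            fused = visual_seq + ctx           # combine into a visual feature space
            attn, _ = self.self_attn(fused, fused, fused)  # 1-D self-attention
            queries = self.pos_queries.expand(fused.size(0), -1, -1)
            dec, _ = self.cross_attn(queries, attn, attn)  # decode attentional features
            logits = self.classifier(dec)      # (batch, max_len, vocab_size)
            return logits, fused               # fused feeds the next block in a stack

Stacking several such blocks, with each block's fused features feeding the next and the text prediction taken from the last block, would correspond to the "stack of selective contextual refinement blocks" the abstract describes.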

Bibliographic Details
Main authors: Litman, Roee; Wu, Jonathan; Litman, Ron; Tsiper, Shahar; Anschel, Oron; Manmatha, Raghavan; Mazor, Shai
Format: Patent (US11308354B1, 2022-04-19)
Language: English
Subjects: CALCULATING; COMPUTING; COUNTING; HANDLING RECORD CARRIERS; PHYSICS; PRESENTATION OF DATA; RECOGNITION OF DATA; RECORD CARRIERS
Online access: Full text via esp@cenet: https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20220419&DB=EPODOC&CC=US&NR=11308354B1