HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection

As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. From a hybrid perspective, w...

Full description

Bibliographic details
Published in: IEEE transactions on image processing, 2023-01, Vol. PP, p. 1-1
Main authors: Bi, Hengyue; Xu, Canhui; Shi, Cao; Liu, Guozhu; Zhang, Honghong; Li, Yuteng; Dong, Junyu
Format: Article
Language: English
container_end_page 1
container_issue
container_start_page 1
container_title IEEE transactions on image processing
container_volume PP
creator Bi, Hengyue
Xu, Canhui
Shi, Cao
Liu, Guozhu
Zhang, Honghong
Li, Yuteng
Dong, Junyu
description As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. From a hybrid perspective, we explore hierarchical text instance-level and component-level representation for arbitrarily-shaped scene text detection. In this work, we propose a novel Hierarchical Graph Reasoning Network (HGR-Net), which consists of a Text Feature Extraction Network (TFEN) and a Text Relation Learner Network (TRLN). TFEN adaptively learns multi-grained text candidates based on shared convolutional feature maps, including instance-level text contours and component-level quadrangles. In TRLN, an inter-text graph is constructed to explore global contextual information with position-awareness between text instances, and an intra-text graph is designed to estimate geometric attributes for establishing component-level linkages. Next, we bridge the cross-feed interaction between instance-level and component-level, and it further achieves hierarchical relational reasoning by learning complementary graph embeddings across levels. Experiments conducted on three publicly available benchmarks SCUT-CTW1500, Total-Text, and ICDAR15 have demonstrated that HGR-Net achieves state-of-the-art performance on arbitrary orientation and arbitrary shape scene text detection.
doi_str_mv 10.1109/TIP.2023.3294822
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_pubmed_primary_37459262</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10185179</ieee_id><sourcerecordid>2839253140</sourcerecordid><originalsourceid>FETCH-LOGICAL-c348t-33d2accb5687bf57b20587986f2f85d8de29981e697bb83531eeea648fafdf133</originalsourceid><addsrcrecordid>eNpdkEtLAzEURoMoPqp7FyIBN26m5jmTuCtVW6Go1Op2yGRu7NR2piZT1H9vpFXE1Q3c833cHISOKelSSvTF5PahywjjXc60UIxtoX2qBU0IEWw7vonMkowKvYcOQpgRQoWk6S7a45mQmqVsHz0PB-PkDtpLPKzAG2-nlTVzPPBmOcVjMKGpq_oFR-K98a_YNR73fFG1Ef3Ej1OzBPxooQY8gY8WX0ELtq2a-hDtODMPcLSZHfR0cz3pD5PR_eC23xsllgvVJpyXzFhbyFRlhZNZwYhUmVapY07JUpXAtFYUUp0VheKSUwAwqVDOuNJRzjvofN279M3bCkKbL6pgYT43NTSrkDPFNYsxQSJ69g-dNStfx-siFfeaqFRFiqwp65sQPLh86atF_GxOSf7tPI_O82_n-cZ5jJxuilfFAsrfwI_kCJysgSpe_6ePKkkzzb8AiX-D4A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2840390868</pqid></control><display><type>article</type><title>HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection</title><source>IEEE Electronic Library (IEL)</source><creator>Bi, Hengyue ; Xu, Canhui ; Shi, Cao ; Liu, Guozhu ; Zhang, Honghong ; Li, Yuteng ; Dong, Junyu</creator><creatorcontrib>Bi, Hengyue ; Xu, Canhui ; Shi, Cao ; Liu, Guozhu ; Zhang, Honghong ; Li, Yuteng ; Dong, Junyu</creatorcontrib><description>As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. From a hybrid perspective, we explore hierarchical text instance-level and component-level representation for arbitrarily-shaped scene text detection. 
In this work, we propose a novel Hierarchical Graph Reasoning Network (HGR-Net), which consists of a Text Feature Extraction Network (TFEN) and a Text Relation Learner Network (TRLN). TFEN adaptively learns multi-grained text candidates based on shared convolutional feature maps, including instance-level text contours and component-level quadrangles. In TRLN, an inter-text graph is constructed to explore global contextual information with position-awareness between text instances, and an intra-text graph is designed to estimate geometric attributes for establishing component-level linkages. Next, we bridge the cross-feed interaction between instance-level and component-level, and it further achieves hierarchical relational reasoning by learning complementary graph embeddings across levels. Experiments conducted on three publicly available benchmarks SCUT-CTW1500, Total-Text, and ICDAR15 have demonstrated that HGR-Net achieves state-of-the-art performance on arbitrary orientation and arbitrary shape scene text detection.</description><identifier>ISSN: 1057-7149</identifier><identifier>EISSN: 1941-0042</identifier><identifier>DOI: 10.1109/TIP.2023.3294822</identifier><identifier>PMID: 37459262</identifier><identifier>CODEN: IIPRE4</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>arbitrary shape text ; Cognition ; Couplings ; Feature extraction ; Feature maps ; Graph Convolutional Network ; hierarchical relation modeling ; Layout ; Proposals ; Reasoning ; Scene text detection ; Shape ; Text detection</subject><ispartof>IEEE transactions on image processing, 2023-01, Vol.PP, p.1-1</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c348t-33d2accb5687bf57b20587986f2f85d8de29981e697bb83531eeea648fafdf133</citedby><cites>FETCH-LOGICAL-c348t-33d2accb5687bf57b20587986f2f85d8de29981e697bb83531eeea648fafdf133</cites><orcidid>0000-0003-2748-5557 ; 0000-0001-7012-2087 ; 0000-0002-4191-3186 ; 0000-0002-1578-3576 ; 0000-0002-8880-4526 ; 0000-0002-7169-7760 ; 0000-0002-9907-6747</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10185179$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10185179$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37459262$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Bi, Hengyue</creatorcontrib><creatorcontrib>Xu, Canhui</creatorcontrib><creatorcontrib>Shi, Cao</creatorcontrib><creatorcontrib>Liu, Guozhu</creatorcontrib><creatorcontrib>Zhang, Honghong</creatorcontrib><creatorcontrib>Li, Yuteng</creatorcontrib><creatorcontrib>Dong, Junyu</creatorcontrib><title>HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection</title><title>IEEE transactions on image processing</title><addtitle>TIP</addtitle><addtitle>IEEE Trans Image Process</addtitle><description>As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. 
From a hybrid perspective, we explore hierarchical text instance-level and component-level representation for arbitrarily-shaped scene text detection. In this work, we propose a novel Hierarchical Graph Reasoning Network (HGR-Net), which consists of a Text Feature Extraction Network (TFEN) and a Text Relation Learner Network (TRLN). TFEN adaptively learns multi-grained text candidates based on shared convolutional feature maps, including instance-level text contours and component-level quadrangles. In TRLN, an inter-text graph is constructed to explore global contextual information with position-awareness between text instances, and an intra-text graph is designed to estimate geometric attributes for establishing component-level linkages. Next, we bridge the cross-feed interaction between instance-level and component-level, and it further achieves hierarchical relational reasoning by learning complementary graph embeddings across levels. Experiments conducted on three publicly available benchmarks SCUT-CTW1500, Total-Text, and ICDAR15 have demonstrated that HGR-Net achieves state-of-the-art performance on arbitrary orientation and arbitrary shape scene text detection.</description><subject>arbitrary shape text</subject><subject>Cognition</subject><subject>Couplings</subject><subject>Feature extraction</subject><subject>Feature maps</subject><subject>Graph Convolutional Network</subject><subject>hierarchical relation modeling</subject><subject>Layout</subject><subject>Proposals</subject><subject>Reasoning</subject><subject>Scene text detection</subject><subject>Shape</subject><subject>Text 
detection</subject><issn>1057-7149</issn><issn>1941-0042</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkEtLAzEURoMoPqp7FyIBN26m5jmTuCtVW6Go1Op2yGRu7NR2piZT1H9vpFXE1Q3c833cHISOKelSSvTF5PahywjjXc60UIxtoX2qBU0IEWw7vonMkowKvYcOQpgRQoWk6S7a45mQmqVsHz0PB-PkDtpLPKzAG2-nlTVzPPBmOcVjMKGpq_oFR-K98a_YNR73fFG1Ef3Ej1OzBPxooQY8gY8WX0ELtq2a-hDtODMPcLSZHfR0cz3pD5PR_eC23xsllgvVJpyXzFhbyFRlhZNZwYhUmVapY07JUpXAtFYUUp0VheKSUwAwqVDOuNJRzjvofN279M3bCkKbL6pgYT43NTSrkDPFNYsxQSJ69g-dNStfx-siFfeaqFRFiqwp65sQPLh86atF_GxOSf7tPI_O82_n-cZ5jJxuilfFAsrfwI_kCJysgSpe_6ePKkkzzb8AiX-D4A</recordid><startdate>20230101</startdate><enddate>20230101</enddate><creator>Bi, Hengyue</creator><creator>Xu, Canhui</creator><creator>Shi, Cao</creator><creator>Liu, Guozhu</creator><creator>Zhang, Honghong</creator><creator>Li, Yuteng</creator><creator>Dong, Junyu</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-2748-5557</orcidid><orcidid>https://orcid.org/0000-0001-7012-2087</orcidid><orcidid>https://orcid.org/0000-0002-4191-3186</orcidid><orcidid>https://orcid.org/0000-0002-1578-3576</orcidid><orcidid>https://orcid.org/0000-0002-8880-4526</orcidid><orcidid>https://orcid.org/0000-0002-7169-7760</orcidid><orcidid>https://orcid.org/0000-0002-9907-6747</orcidid></search><sort><creationdate>20230101</creationdate><title>HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection</title><author>Bi, Hengyue ; Xu, Canhui ; Shi, Cao ; Liu, Guozhu ; Zhang, Honghong ; Li, Yuteng ; Dong, Junyu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c348t-33d2accb5687bf57b20587986f2f85d8de29981e697bb83531eeea648fafdf133</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>arbitrary shape text</topic><topic>Cognition</topic><topic>Couplings</topic><topic>Feature extraction</topic><topic>Feature maps</topic><topic>Graph Convolutional Network</topic><topic>hierarchical relation modeling</topic><topic>Layout</topic><topic>Proposals</topic><topic>Reasoning</topic><topic>Scene text detection</topic><topic>Shape</topic><topic>Text detection</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Bi, Hengyue</creatorcontrib><creatorcontrib>Xu, Canhui</creatorcontrib><creatorcontrib>Shi, Cao</creatorcontrib><creatorcontrib>Liu, Guozhu</creatorcontrib><creatorcontrib>Zhang, Honghong</creatorcontrib><creatorcontrib>Li, Yuteng</creatorcontrib><creatorcontrib>Dong, Junyu</creatorcontrib><collection>IEEE All-Society 
Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on image processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Bi, Hengyue</au><au>Xu, Canhui</au><au>Shi, Cao</au><au>Liu, Guozhu</au><au>Zhang, Honghong</au><au>Li, Yuteng</au><au>Dong, Junyu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection</atitle><jtitle>IEEE transactions on image processing</jtitle><stitle>TIP</stitle><addtitle>IEEE Trans Image Process</addtitle><date>2023-01-01</date><risdate>2023</risdate><volume>PP</volume><spage>1</spage><epage>1</epage><pages>1-1</pages><issn>1057-7149</issn><eissn>1941-0042</eissn><coden>IIPRE4</coden><abstract>As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. From a hybrid perspective, we explore hierarchical text instance-level and component-level representation for arbitrarily-shaped scene text detection. 
In this work, we propose a novel Hierarchical Graph Reasoning Network (HGR-Net), which consists of a Text Feature Extraction Network (TFEN) and a Text Relation Learner Network (TRLN). TFEN adaptively learns multi-grained text candidates based on shared convolutional feature maps, including instance-level text contours and component-level quadrangles. In TRLN, an inter-text graph is constructed to explore global contextual information with position-awareness between text instances, and an intra-text graph is designed to estimate geometric attributes for establishing component-level linkages. Next, we bridge the cross-feed interaction between instance-level and component-level, and it further achieves hierarchical relational reasoning by learning complementary graph embeddings across levels. Experiments conducted on three publicly available benchmarks SCUT-CTW1500, Total-Text, and ICDAR15 have demonstrated that HGR-Net achieves state-of-the-art performance on arbitrary orientation and arbitrary shape scene text detection.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>37459262</pmid><doi>10.1109/TIP.2023.3294822</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0003-2748-5557</orcidid><orcidid>https://orcid.org/0000-0001-7012-2087</orcidid><orcidid>https://orcid.org/0000-0002-4191-3186</orcidid><orcidid>https://orcid.org/0000-0002-1578-3576</orcidid><orcidid>https://orcid.org/0000-0002-8880-4526</orcidid><orcidid>https://orcid.org/0000-0002-7169-7760</orcidid><orcidid>https://orcid.org/0000-0002-9907-6747</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1057-7149
ispartof IEEE transactions on image processing, 2023-01, Vol.PP, p.1-1
issn 1057-7149
1941-0042
language eng
recordid cdi_pubmed_primary_37459262
source IEEE Electronic Library (IEL)
subjects arbitrary shape text
Cognition
Couplings
Feature extraction
Feature maps
Graph Convolutional Network
hierarchical relation modeling
Layout
Proposals
Reasoning
Scene text detection
Shape
Text detection
title HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T16%3A28%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=HGR-Net:%20Hierarchical%20Graph%20Reasoning%20Network%20for%20Arbitrary%20Shape%20Scene%20Text%20Detection&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Bi,%20Hengyue&rft.date=2023-01-01&rft.volume=PP&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2023.3294822&rft_dat=%3Cproquest_RIE%3E2839253140%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2840390868&rft_id=info:pmid/37459262&rft_ieee_id=10185179&rfr_iscdi=true
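
The `url` field above is a link-resolver URL whose query string carries the citation as key/encoded-value pairs (`rft.*` keys for the article metadata, repeated `rft_id` keys for identifiers). As a minimal sketch, assuming only Python's standard library, the pairs can be decoded as follows. The function name `parse_openurl` and the shortened `OPENURL` string are illustrative assumptions; the string keeps a handful of parameters copied from the record rather than the full URL.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the resolver URL from the record above
# (illustrative: only a few of the original parameters are kept).
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.jtitle=IEEE%20transactions%20on%20image%20processing"
    "&rft.au=Bi,%20Hengyue"
    "&rft.date=2023-01-01"
    "&rft.issn=1057-7149"
    "&rft_id=info:doi/10.1109/TIP.2023.3294822"
    "&rft_id=info:pmid/37459262"
)


def parse_openurl(url):
    """Hypothetical helper: split a key/value OpenURL into metadata and identifiers."""
    # parse_qs percent-decodes values and collects repeated keys into lists.
    query = parse_qs(urlsplit(url).query)
    # Keys prefixed with "rft." describe the cited item; keep the first value of each.
    referent = {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }
    # "rft_id" may repeat (DOI, PMID, ...), so keep every value in order.
    identifiers = query.get("rft_id", [])
    return referent, identifiers


referent, identifiers = parse_openurl(OPENURL)
print(referent["jtitle"])
print(identifiers)
```

Keeping `rft_id` as a list rather than a single value matters here because the record supplies both a DOI and a PubMed ID under the same key.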