HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection

As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. From a hybrid perspective, w...

Bibliographic Details
Published in:	IEEE transactions on image processing 2023-01, Vol.PP, p.1-1
Main authors:	Bi, Hengyue; Xu, Canhui; Shi, Cao; Liu, Guozhu; Zhang, Honghong; Li, Yuteng; Dong, Junyu
Format:	Article
Language:	English
Subjects:	arbitrary shape text; Cognition; Couplings; Feature extraction; Feature maps; Graph Convolutional Network; hierarchical relation modeling; Layout; Proposals; Reasoning; Scene text detection; Shape; Text detection

container_end_page	1
container_issue
container_start_page	1
container_title	IEEE transactions on image processing
container_volume	PP
creator	Bi, Hengyue ; Xu, Canhui ; Shi, Cao ; Liu, Guozhu ; Zhang, Honghong ; Li, Yuteng ; Dong, Junyu
description	As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. From a hybrid perspective, we explore hierarchical text instance-level and component-level representation for arbitrarily-shaped scene text detection. In this work, we propose a novel Hierarchical Graph Reasoning Network (HGR-Net), which consists of a Text Feature Extraction Network (TFEN) and a Text Relation Learner Network (TRLN). TFEN adaptively learns multi-grained text candidates based on shared convolutional feature maps, including instance-level text contours and component-level quadrangles. In TRLN, an inter-text graph is constructed to explore global contextual information with position-awareness between text instances, and an intra-text graph is designed to estimate geometric attributes for establishing component-level linkages. Next, we bridge the cross-feed interaction between instance-level and component-level, and it further achieves hierarchical relational reasoning by learning complementary graph embeddings across levels. Experiments conducted on three publicly available benchmarks SCUT-CTW1500, Total-Text, and ICDAR15 have demonstrated that HGR-Net achieves state-of-the-art performance on arbitrary orientation and arbitrary shape scene text detection.
doi_str_mv	10.1109/TIP.2023.3294822
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_pubmed_primary_37459262</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10185179</ieee_id><sourcerecordid>2839253140</sourcerecordid><originalsourceid>FETCH-LOGICAL-c348t-33d2accb5687bf57b20587986f2f85d8de29981e697bb83531eeea648fafdf133</originalsourceid><addsrcrecordid>eNpdkEtLAzEURoMoPqp7FyIBN26m5jmTuCtVW6Go1Op2yGRu7NR2piZT1H9vpFXE1Q3c833cHISOKelSSvTF5PahywjjXc60UIxtoX2qBU0IEWw7vonMkowKvYcOQpgRQoWk6S7a45mQmqVsHz0PB-PkDtpLPKzAG2-nlTVzPPBmOcVjMKGpq_oFR-K98a_YNR73fFG1Ef3Ej1OzBPxooQY8gY8WX0ELtq2a-hDtODMPcLSZHfR0cz3pD5PR_eC23xsllgvVJpyXzFhbyFRlhZNZwYhUmVapY07JUpXAtFYUUp0VheKSUwAwqVDOuNJRzjvofN279M3bCkKbL6pgYT43NTSrkDPFNYsxQSJ69g-dNStfx-siFfeaqFRFiqwp65sQPLh86atF_GxOSf7tPI_O82_n-cZ5jJxuilfFAsrfwI_kCJysgSpe_6ePKkkzzb8AiX-D4A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2840390868</pqid></control><display><type>article</type><title>HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection</title><source>IEEE Electronic Library (IEL)</source><creator>Bi, Hengyue ; Xu, Canhui ; Shi, Cao ; Liu, Guozhu ; Zhang, Honghong ; Li, Yuteng ; Dong, Junyu</creator><creatorcontrib>Bi, Hengyue ; Xu, Canhui ; Shi, Cao ; Liu, Guozhu ; Zhang, Honghong ; Li, Yuteng ; Dong, Junyu</creatorcontrib><description>As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. From a hybrid perspective, we explore hierarchical text instance-level and component-level representation for arbitrarily-shaped scene text detection. 
In this work, we propose a novel Hierarchical Graph Reasoning Network (HGR-Net), which consists of a Text Feature Extraction Network (TFEN) and a Text Relation Learner Network (TRLN). TFEN adaptively learns multi-grained text candidates based on shared convolutional feature maps, including instance-level text contours and component-level quadrangles. In TRLN, an inter-text graph is constructed to explore global contextual information with position-awareness between text instances, and an intra-text graph is designed to estimate geometric attributes for establishing component-level linkages. Next, we bridge the cross-feed interaction between instance-level and component-level, and it further achieves hierarchical relational reasoning by learning complementary graph embeddings across levels. Experiments conducted on three publicly available benchmarks SCUT-CTW1500, Total-Text, and ICDAR15 have demonstrated that HGR-Net achieves state-of-the-art performance on arbitrary orientation and arbitrary shape scene text detection.</description><identifier>ISSN: 1057-7149</identifier><identifier>EISSN: 1941-0042</identifier><identifier>DOI: 10.1109/TIP.2023.3294822</identifier><identifier>PMID: 37459262</identifier><identifier>CODEN: IIPRE4</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>arbitrary shape text ; Cognition ; Couplings ; Feature extraction ; Feature maps ; Graph Convolutional Network ; hierarchical relation modeling ; Layout ; Proposals ; Reasoning ; Scene text detection ; Shape ; Text detection</subject><ispartof>IEEE transactions on image processing, 2023-01, Vol.PP, p.1-1</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c348t-33d2accb5687bf57b20587986f2f85d8de29981e697bb83531eeea648fafdf133</citedby><cites>FETCH-LOGICAL-c348t-33d2accb5687bf57b20587986f2f85d8de29981e697bb83531eeea648fafdf133</cites><orcidid>0000-0003-2748-5557 ; 0000-0001-7012-2087 ; 0000-0002-4191-3186 ; 0000-0002-1578-3576 ; 0000-0002-8880-4526 ; 0000-0002-7169-7760 ; 0000-0002-9907-6747</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10185179$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10185179$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37459262$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Bi, Hengyue</creatorcontrib><creatorcontrib>Xu, Canhui</creatorcontrib><creatorcontrib>Shi, Cao</creatorcontrib><creatorcontrib>Liu, Guozhu</creatorcontrib><creatorcontrib>Zhang, Honghong</creatorcontrib><creatorcontrib>Li, Yuteng</creatorcontrib><creatorcontrib>Dong, Junyu</creatorcontrib><title>HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection</title><title>IEEE transactions on image processing</title><addtitle>TIP</addtitle><addtitle>IEEE Trans Image Process</addtitle><description>As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. 
From a hybrid perspective, we explore hierarchical text instance-level and component-level representation for arbitrarily-shaped scene text detection. In this work, we propose a novel Hierarchical Graph Reasoning Network (HGR-Net), which consists of a Text Feature Extraction Network (TFEN) and a Text Relation Learner Network (TRLN). TFEN adaptively learns multi-grained text candidates based on shared convolutional feature maps, including instance-level text contours and component-level quadrangles. In TRLN, an inter-text graph is constructed to explore global contextual information with position-awareness between text instances, and an intra-text graph is designed to estimate geometric attributes for establishing component-level linkages. Next, we bridge the cross-feed interaction between instance-level and component-level, and it further achieves hierarchical relational reasoning by learning complementary graph embeddings across levels. Experiments conducted on three publicly available benchmarks SCUT-CTW1500, Total-Text, and ICDAR15 have demonstrated that HGR-Net achieves state-of-the-art performance on arbitrary orientation and arbitrary shape scene text detection.</description><subject>arbitrary shape text</subject><subject>Cognition</subject><subject>Couplings</subject><subject>Feature extraction</subject><subject>Feature maps</subject><subject>Graph Convolutional Network</subject><subject>hierarchical relation modeling</subject><subject>Layout</subject><subject>Proposals</subject><subject>Reasoning</subject><subject>Scene text detection</subject><subject>Shape</subject><subject>Text 
detection</subject><issn>1057-7149</issn><issn>1941-0042</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkEtLAzEURoMoPqp7FyIBN26m5jmTuCtVW6Go1Op2yGRu7NR2piZT1H9vpFXE1Q3c833cHISOKelSSvTF5PahywjjXc60UIxtoX2qBU0IEWw7vonMkowKvYcOQpgRQoWk6S7a45mQmqVsHz0PB-PkDtpLPKzAG2-nlTVzPPBmOcVjMKGpq_oFR-K98a_YNR73fFG1Ef3Ej1OzBPxooQY8gY8WX0ELtq2a-hDtODMPcLSZHfR0cz3pD5PR_eC23xsllgvVJpyXzFhbyFRlhZNZwYhUmVapY07JUpXAtFYUUp0VheKSUwAwqVDOuNJRzjvofN279M3bCkKbL6pgYT43NTSrkDPFNYsxQSJ69g-dNStfx-siFfeaqFRFiqwp65sQPLh86atF_GxOSf7tPI_O82_n-cZ5jJxuilfFAsrfwI_kCJysgSpe_6ePKkkzzb8AiX-D4A</recordid><startdate>20230101</startdate><enddate>20230101</enddate><creator>Bi, Hengyue</creator><creator>Xu, Canhui</creator><creator>Shi, Cao</creator><creator>Liu, Guozhu</creator><creator>Zhang, Honghong</creator><creator>Li, Yuteng</creator><creator>Dong, Junyu</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-2748-5557</orcidid><orcidid>https://orcid.org/0000-0001-7012-2087</orcidid><orcidid>https://orcid.org/0000-0002-4191-3186</orcidid><orcidid>https://orcid.org/0000-0002-1578-3576</orcidid><orcidid>https://orcid.org/0000-0002-8880-4526</orcidid><orcidid>https://orcid.org/0000-0002-7169-7760</orcidid><orcidid>https://orcid.org/0000-0002-9907-6747</orcidid></search><sort><creationdate>20230101</creationdate><title>HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection</title><author>Bi, Hengyue ; Xu, Canhui ; Shi, Cao ; Liu, Guozhu ; Zhang, Honghong ; Li, Yuteng ; Dong, Junyu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c348t-33d2accb5687bf57b20587986f2f85d8de29981e697bb83531eeea648fafdf133</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>arbitrary shape text</topic><topic>Cognition</topic><topic>Couplings</topic><topic>Feature extraction</topic><topic>Feature maps</topic><topic>Graph Convolutional Network</topic><topic>hierarchical relation modeling</topic><topic>Layout</topic><topic>Proposals</topic><topic>Reasoning</topic><topic>Scene text detection</topic><topic>Shape</topic><topic>Text detection</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Bi, Hengyue</creatorcontrib><creatorcontrib>Xu, Canhui</creatorcontrib><creatorcontrib>Shi, Cao</creatorcontrib><creatorcontrib>Liu, Guozhu</creatorcontrib><creatorcontrib>Zhang, Honghong</creatorcontrib><creatorcontrib>Li, Yuteng</creatorcontrib><creatorcontrib>Dong, Junyu</creatorcontrib><collection>IEEE All-Society 
Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on image processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Bi, Hengyue</au><au>Xu, Canhui</au><au>Shi, Cao</au><au>Liu, Guozhu</au><au>Zhang, Honghong</au><au>Li, Yuteng</au><au>Dong, Junyu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection</atitle><jtitle>IEEE transactions on image processing</jtitle><stitle>TIP</stitle><addtitle>IEEE Trans Image Process</addtitle><date>2023-01-01</date><risdate>2023</risdate><volume>PP</volume><spage>1</spage><epage>1</epage><pages>1-1</pages><issn>1057-7149</issn><eissn>1941-0042</eissn><coden>IIPRE4</coden><abstract>As a prerequisite step of scene text reading, scene text detection is known as a challenging task due to natural scene text diversity and variability. Most existing methods either adopt bottom-up sub-text component extraction or focus on top-down text contour regression. From a hybrid perspective, we explore hierarchical text instance-level and component-level representation for arbitrarily-shaped scene text detection. 
In this work, we propose a novel Hierarchical Graph Reasoning Network (HGR-Net), which consists of a Text Feature Extraction Network (TFEN) and a Text Relation Learner Network (TRLN). TFEN adaptively learns multi-grained text candidates based on shared convolutional feature maps, including instance-level text contours and component-level quadrangles. In TRLN, an inter-text graph is constructed to explore global contextual information with position-awareness between text instances, and an intra-text graph is designed to estimate geometric attributes for establishing component-level linkages. Next, we bridge the cross-feed interaction between instance-level and component-level, and it further achieves hierarchical relational reasoning by learning complementary graph embeddings across levels. Experiments conducted on three publicly available benchmarks SCUT-CTW1500, Total-Text, and ICDAR15 have demonstrated that HGR-Net achieves state-of-the-art performance on arbitrary orientation and arbitrary shape scene text detection.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>37459262</pmid><doi>10.1109/TIP.2023.3294822</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0003-2748-5557</orcidid><orcidid>https://orcid.org/0000-0001-7012-2087</orcidid><orcidid>https://orcid.org/0000-0002-4191-3186</orcidid><orcidid>https://orcid.org/0000-0002-1578-3576</orcidid><orcidid>https://orcid.org/0000-0002-8880-4526</orcidid><orcidid>https://orcid.org/0000-0002-7169-7760</orcidid><orcidid>https://orcid.org/0000-0002-9907-6747</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1057-7149
ispartof	IEEE transactions on image processing, 2023-01, Vol.PP, p.1-1
issn	1057-7149 1941-0042
language	eng
recordid	cdi_pubmed_primary_37459262
source	IEEE Electronic Library (IEL)
subjects	arbitrary shape text ; Cognition ; Couplings ; Feature extraction ; Feature maps ; Graph Convolutional Network ; hierarchical relation modeling ; Layout ; Proposals ; Reasoning ; Scene text detection ; Shape ; Text detection
title	HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene Text Detection
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T16%3A28%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=HGR-Net:%20Hierarchical%20Graph%20Reasoning%20Network%20for%20Arbitrary%20Shape%20Scene%20Text%20Detection&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Bi,%20Hengyue&rft.date=2023-01-01&rft.volume=PP&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2023.3294822&rft_dat=%3Cproquest_RIE%3E2839253140%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2840390868&rft_id=info:pmid/37459262&rft_ieee_id=10185179&rfr_iscdi=true