Text Enhancement Network for Cross-domain Scene Text Detection
Conventional scene text detection approaches essentially assume that training and test data are drawn from the same distribution and have achieved compelling results. However, scene text detectors often suffer from performance degradation in real-world applications, since the feature distribution of training images is different from that of test images obtained from a new scene.
Saved in:
Published in: | IEEE signal processing letters 2022, Vol.29, p.1-5 |
---|---|
Main authors: | Deng, Jinhong; Luo, Xiulian; Zheng, Jiawen; Dang, Wanli; Li, Wen |
Format: | Article |
Language: | eng |
Subjects: | Background noise; Convolution; Detectors; Domain adaptation; Domains; Feature extraction; Geometry; Modules; Object detection; Performance degradation; Scene text detection; Semantics; Training |
container_end_page | 5 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE signal processing letters |
container_volume | 29 |
creator | Deng, Jinhong; Luo, Xiulian; Zheng, Jiawen; Dang, Wanli; Li, Wen |
description | Conventional scene text detection approaches essentially assume that training and test data are drawn from the same distribution and have achieved compelling results. However, scene text detectors often suffer from performance degradation in real-world applications, since the feature distribution of training images is different from that of test images obtained from a new scene. To address this problem, we propose a novel method called Text Enhancement Network (TEN) based on adversarial learning for cross-domain scene text detection. Specifically, we first design a Multi-adversarial Feature Alignment (MFA) module to maximally align features of the source and target data from low-level texture to high-level semantics. Second, we develop the Text Attention Enhancement (TAE) module to re-weight the importance of text regions and accordingly enhance the corresponding features, improving robustness against noisy backgrounds. Additionally, we design a self-training strategy to further boost the performance of our TEN. We conduct extensive experiments on five benchmarks, and the experimental results demonstrate the effectiveness of our TEN. |
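The abstract above outlines three components: multi-level adversarial feature alignment (MFA), attention-based re-weighting of text regions (TAE), and self-training. For a concrete picture of the first two ideas, here is a minimal, hypothetical PyTorch sketch; it is not the authors' code, and every name, channel size, and loss choice below is an assumption. Multi-level adversarial alignment is commonly realized with per-level domain discriminators trained through a gradient-reversal layer, and a TAE-style block can be sketched as a learned attention map that residually re-weights the features.

```python
# Illustrative sketch only (not the paper's released code): a minimal
# gradient-reversal-based multi-level adversarial alignment in the spirit
# of MFA, plus a TAE-style attention re-weighting block. All class names,
# channel sizes, and loss choices are assumptions for illustration.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses and scales gradients on backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg() * ctx.lambd, None


class DomainDiscriminator(nn.Module):
    """Per-level domain classifier: predicts source (label 0) vs. target (label 1)."""

    def __init__(self, in_channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 256, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 1, kernel_size=1),
        )

    def forward(self, feat, lambd=1.0):
        # The gradient-reversal layer turns discriminator training into
        # adversarial feature alignment for the backbone.
        return self.net(GradReverse.apply(feat, lambd))


class TextAttentionEnhancement(nn.Module):
    """TAE-style block (hypothetical): predict a text-ness map and use it
    to re-weight the input features, emphasizing likely text regions."""

    def __init__(self, in_channels):
        super().__init__()
        self.attn = nn.Sequential(nn.Conv2d(in_channels, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, feat):
        a = self.attn(feat)        # (N, 1, H, W) attention map in [0, 1]
        return feat * (1.0 + a)    # residual re-weighting preserves background features


# One discriminator per backbone level; channel sizes assume a ResNet-style backbone.
discriminators = nn.ModuleList(DomainDiscriminator(c) for c in (512, 1024, 2048))
bce = nn.BCEWithLogitsLoss()


def mfa_loss(source_feats, target_feats, lambd=1.0):
    """Sum the adversarial alignment loss over all feature levels."""
    loss = 0.0
    for disc, fs, ft in zip(discriminators, source_feats, target_feats):
        ps, pt = disc(fs, lambd), disc(ft, lambd)
        loss = loss + bce(ps, torch.zeros_like(ps)) + bce(pt, torch.ones_like(pt))
    return loss
```

In a training loop, `mfa_loss` would simply be added to the detector's supervised loss on source images; minimizing it trains the discriminators while the reversed gradients push source and target feature distributions together at every level.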
doi_str_mv | 10.1109/LSP.2022.3214155 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1070-9908; EISSN: 1558-2361 |
ispartof | IEEE signal processing letters, 2022, Vol.29, p.1-5 |
issn | 1070-9908 (print); 1558-2361 (electronic) |
language | eng |
recordid | cdi_proquest_journals_2731856918 |
source | IEEE Electronic Library (IEL) |
subjects | Background noise; Convolution; Detectors; Domain adaptation; Domains; Feature extraction; Geometry; Modules; Object detection; Performance degradation; Scene text detection; Semantics; Training |
title | Text Enhancement Network for Cross-domain Scene Text Detection |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T14%3A14%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Text%20Enhancement%20Network%20for%20Cross-domain%20Scene%20Text%20Detection&rft.jtitle=IEEE%20signal%20processing%20letters&rft.au=Deng,%20Jinhong&rft.date=2022&rft.volume=29&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.issn=1070-9908&rft.eissn=1558-2361&rft.coden=ISPLEM&rft_id=info:doi/10.1109/LSP.2022.3214155&rft_dat=%3Cproquest_RIE%3E2731856918%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2731856918&rft_id=info:pmid/&rft_ieee_id=9917319&rfr_iscdi=true |