Mining Domain Knowledge: Improved Framework Towards Automatically Standardizing Anatomical Structure Nomenclature in Radiotherapy
The automatic standardization of nomenclature for anatomical structures in radiotherapy (RT) clinical data is a critical prerequisite for data curation and data-driven research in the era of big data and artificial intelligence, but it is currently an unmet need. Existing methods either cannot handl...
Published in: | IEEE access 2020, Vol.8, p.105286-105300 |
---|---|
Main authors: | Yang, Qiming ; Chao, Hongyang ; Nguyen, Dan ; Jiang, Steve |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 105300 |
---|---|
container_issue | |
container_start_page | 105286 |
container_title | IEEE access |
container_volume | 8 |
creator | Yang, Qiming ; Chao, Hongyang ; Nguyen, Dan ; Jiang, Steve |
description | The automatic standardization of nomenclature for anatomical structures in radiotherapy (RT) clinical data is a critical prerequisite for data curation and data-driven research in the era of big data and artificial intelligence, but it is currently an unmet need. Existing methods either cannot handle cross-institutional datasets or suffer from heavy imbalance and poor-quality delineation in clinical RT datasets. To solve these problems, we propose an automated structure nomenclature standardization framework, 3D Non-local Network with Voting (3DNNV). This framework consists of an improved data processing strategy, namely, adaptive sampling and adaptive cropping (ASAC) with voting, and an optimized feature extraction module. The framework simulates clinicians' domain knowledge and recognition mechanisms to identify small-volume organs at risk (OARs) with heavily imbalanced data better than other methods. We used partial data from an open-source head-and-neck cancer dataset to train the model, then tested the model on three cross-institutional datasets to demonstrate its generalizability. 3DNNV outperformed the baseline model, achieving higher average true positive rates (TPR) over all categories on the three test datasets (+8.27%, +2.39%, and +5.53%, respectively). More importantly, the 3DNNV outperformed the baseline on the test dataset, 28.63% to 91.17%, in terms of F1 score for a small-volume OAR with only 9 training samples. The results show that 3DNNV can be applied to identify OARs, even error-prone ones. Furthermore, we discussed the limitations and applicability of the framework in practical scenarios. The framework we developed can assist in standardizing structure nomenclature to facilitate data-driven clinical research in cancer radiotherapy. |
doi_str_mv | 10.1109/ACCESS.2020.2999079 |
format | Article |
coden | IAECCG |
creationdate | 2020 |
eissn | 2169-3536 |
ieee_id | 9104998 |
linktohtml | https://ieeexplore.ieee.org/document/9104998 |
oa | free_for_read |
orcid | https://orcid.org/0000-0002-6104-2322 ; https://orcid.org/0000-0002-9505-1649 |
publisher | Piscataway: IEEE |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
toplevel | peer_reviewed ; online_resources |
tpages | 15 |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2020, Vol.8, p.105286-105300 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_proquest_journals_2454442761 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | 3D classification ; Adaptive sampling ; Artificial intelligence ; Cancer ; Data mining ; Data models ; Data processing ; Datasets ; deep learning ; Domains ; Feature extraction ; Model testing ; Nomenclature standardization ; Organs ; Radiation therapy ; radiotherapy ; Semantics ; Standardization ; Task analysis ; Voting |
title | Mining Domain Knowledge: Improved Framework Towards Automatically Standardizing Anatomical Structure Nomenclature in Radiotherapy |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T06%3A52%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Mining%20Domain%20Knowledge:%20Improved%20Framework%20Towards%20Automatically%20Standardizing%20Anatomical%20Structure%20Nomenclature%20in%20Radiotherapy&rft.jtitle=IEEE%20access&rft.au=Yang,%20Qiming&rft.date=2020&rft.volume=8&rft.spage=105286&rft.epage=105300&rft.pages=105286-105300&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2020.2999079&rft_dat=%3Cproquest_cross%3E2454442761%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2454442761&rft_id=info:pmid/&rft_ieee_id=9104998&rft_doaj_id=oai_doaj_org_article_1b545199f3e448fcb51b8997a3a0bea6&rfr_iscdi=true |
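The description above mentions two ideas that a small example may make more concrete: combining several predictions for one structure by voting, and reporting the true positive rate (TPR) averaged over categories alongside a per-category F1 score. The following is only a minimal sketch under stated assumptions, not the authors' 3DNNV implementation. The record does not say how ASAC samples or crops, so the sketch assumes each structure simply arrives with a list of per-crop label predictions. The structure names, the helper names (`majority_vote`, `per_class_scores`, `average_tpr`), and the toy data are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label among per-crop predictions.

    Ties are resolved in favour of the label seen first (an assumption
    made for this sketch; the record does not describe tie handling).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]


def per_class_scores(y_true, y_pred):
    """Return {label: (tpr, f1)} for every label that occurs in y_true."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        tpr = tp / (tp + fn)  # tp + fn >= 1 because label occurs in y_true
        f1 = 2 * tp / (2 * tp + fp + fn)
        scores[label] = (tpr, f1)
    return scores


def average_tpr(scores):
    """Unweighted mean of the per-category TPR values."""
    return sum(tpr for tpr, _ in scores.values()) / len(scores)


# Hypothetical toy data: four structures, three crop-level predictions each.
y_true = ["Brainstem", "Chiasm", "SpinalCord", "Chiasm"]
crop_predictions = [
    ["Brainstem", "Brainstem", "SpinalCord"],
    ["Chiasm", "OpticNerve", "Chiasm"],
    ["SpinalCord", "SpinalCord", "SpinalCord"],
    ["OpticNerve", "OpticNerve", "Chiasm"],
]

y_pred = [majority_vote(crops) for crops in crop_predictions]
scores = per_class_scores(y_true, y_pred)

for label, (tpr, f1) in scores.items():
    print(f"{label}: TPR={tpr:.2f} F1={f1:.2f}")
print(f"average TPR over categories: {average_tpr(scores):.4f}")
```

Averaging TPR without weighting by category size gives rare categories the same influence as common ones, which is one plausible reason to report it for heavily imbalanced data such as small-volume OARs.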