Mining Domain Knowledge: Improved Framework Towards Automatically Standardizing Anatomical Structure Nomenclature in Radiotherapy

The automatic standardization of nomenclature for anatomical structures in radiotherapy (RT) clinical data is a critical prerequisite for data curation and data-driven research in the era of big data and artificial intelligence, but it is currently an unmet need. Existing methods either cannot handl...

Bibliographic details
Published in: IEEE access 2020, Vol.8, p.105286-105300
Main authors: Yang, Qiming; Chao, Hongyang; Nguyen, Dan; Jiang, Steve
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 105300
container_issue
container_start_page 105286
container_title IEEE access
container_volume 8
creator Yang, Qiming
Chao, Hongyang
Nguyen, Dan
Jiang, Steve
description The automatic standardization of nomenclature for anatomical structures in radiotherapy (RT) clinical data is a critical prerequisite for data curation and data-driven research in the era of big data and artificial intelligence, but it remains an unmet need. Existing methods either cannot handle cross-institutional datasets or suffer from the heavy class imbalance and poor-quality delineation found in clinical RT datasets. To solve these problems, we propose an automated structure nomenclature standardization framework, 3D Non-local Network with Voting (3DNNV). This framework consists of an improved data processing strategy, namely adaptive sampling and adaptive cropping (ASAC) with voting, and an optimized feature extraction module. The framework simulates clinicians' domain knowledge and recognition mechanisms to identify small-volume organs at risk (OARs) with heavily imbalanced data better than other methods. We used partial data from an open-source head-and-neck cancer dataset to train the model, then tested the model on three cross-institutional datasets to demonstrate its generalizability. 3DNNV outperformed the baseline model, achieving higher average true positive rates (TPR) over all categories on the three test datasets (+8.27%, +2.39%, and +5.53%, respectively). More importantly, for a small-volume OAR with only 9 training samples, 3DNNV raised the F1 score on the test dataset from the baseline's 28.63% to 91.17%. The results show that 3DNNV can be applied to identify OARs, even error-prone ones. We also discuss the limitations and applicability of the framework in practical scenarios. The framework we developed can assist in standardizing structure nomenclature to facilitate data-driven clinical research in cancer radiotherapy.
doi_str_mv 10.1109/ACCESS.2020.2999079
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2454442761</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9104998</ieee_id><doaj_id>oai_doaj_org_article_1b545199f3e448fcb51b8997a3a0bea6</doaj_id><sourcerecordid>2454442761</sourcerecordid><originalsourceid>FETCH-LOGICAL-c408t-ad07979f7f1a5823e24db0d6cfb3d3c7847e1170f170b1c48cd3d462647beca93</originalsourceid><addsrcrecordid>eNpNUctu2zAQFIoGaODkC3IR0LNdviSKvRmukxp1GiBOzsSKXLl0JdGlpBrOLX9e2gqCEiC4nNmd5XKS5IaSGaVEfZkvFsvNZsYIIzOmlCJSfUguGc3VlGc8__hf_Cm57rodiauIUCYvk9d717p2m37zDbg2_dH6Q412i1_TVbMP_i_a9DZAgwcffqdP_gDBdul86GN67wzU9THd9NDaiLuXk9C8hUieqEiEwfRDwPSnb7A1NZwvsc0jWOf7Xxhgf7xKLiqoO7x-OyfJ8-3yafF9un64Wy3m66kRpOinYONcUlWyopAVjCMTtiQ2N1XJLTeyEBIplaSKu6RGFMZyK3KWC1miAcUnyWrUtR52eh9cA-GoPTh9BnzYaghxpho1LTORUaUqjkIUlSkzWhZKSeBASoQ8an0eteIX_Rmw6_XOD6GNz9dMZEIIJnMas_iYZYLvuoDVe1dK9Mk6PVqnT9bpN-ti1c1Y5RDxvUJRIpQq-D_CdJeT</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2454442761</pqid></control><display><type>article</type><title>Mining Domain Knowledge: Improved Framework Towards Automatically Standardizing Anatomical Structure Nomenclature in Radiotherapy</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Yang, Qiming ; Chao, Hongyang ; Nguyen, Dan ; Jiang, Steve</creator><creatorcontrib>Yang, Qiming ; Chao, Hongyang ; Nguyen, Dan ; Jiang, Steve</creatorcontrib><description>The automatic standardization of nomenclature for anatomical structures in radiotherapy (RT) clinical data is a critical prerequisite for data curation and data-driven research in the era of big data and artificial intelligence, but it is currently an unmet need. 
Existing methods either cannot handle cross-institutional datasets or suffer from heavy imbalance and poor-quality delineation in clinical RT datasets. To solve these problems, we propose an automated structure nomenclature standardization framework, 3D Non-local Network with Voting (3DNNV). This framework consists of an improved data processing strategy, namely, adaptive sampling and adaptive cropping (ASAC) with voting, and an optimized feature extraction module. The framework simulates clinicians' domain knowledge and recognition mechanisms to identify small-volume organs at risk (OARs) with heavily imbalanced data better than other methods. We used partial data from an open-source head-and-neck cancer dataset to train the model, then tested the model on three cross-institutional datasets to demonstrate its generalizability. 3DNNV outperformed the baseline model, achieving higher average true positive rates (TPR) over all categories on the three test datasets (+8.27%, +2.39%, and +5.53%, respectively). More importantly, the 3DNNV outperformed the baseline on the test dataset, 28.63% to 91.17%, in terms of F1 score for a small-volume OAR with only 9 training samples. The results show that 3DNNV can be applied to identify OARs, even error-prone ones. Furthermore, we discussed the limitations and applicability of the framework in practical scenarios. 
The framework we developed can assist in standardizing structure nomenclature to facilitate data-driven clinical research in cancer radiotherapy.</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2020.2999079</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>3D classification ; Adaptive sampling ; Artificial intelligence ; Cancer ; Data mining ; Data models ; Data processing ; Datasets ; deep learning ; Domains ; Feature extraction ; Model testing ; Nomenclature standardization ; Organs ; Radiation therapy ; radiotherapy ; Semantics ; Standardization ; Task analysis ; Voting</subject><ispartof>IEEE access, 2020, Vol.8, p.105286-105300</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c408t-ad07979f7f1a5823e24db0d6cfb3d3c7847e1170f170b1c48cd3d462647beca93</citedby><cites>FETCH-LOGICAL-c408t-ad07979f7f1a5823e24db0d6cfb3d3c7847e1170f170b1c48cd3d462647beca93</cites><orcidid>0000-0002-6104-2322 ; 0000-0002-9505-1649</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9104998$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,864,2101,4023,27632,27922,27923,27924,54932</link.rule.ids></links><search><creatorcontrib>Yang, Qiming</creatorcontrib><creatorcontrib>Chao, Hongyang</creatorcontrib><creatorcontrib>Nguyen, Dan</creatorcontrib><creatorcontrib>Jiang, Steve</creatorcontrib><title>Mining Domain Knowledge: Improved Framework Towards Automatically Standardizing Anatomical Structure Nomenclature in Radiotherapy</title><title>IEEE 
access</title><addtitle>Access</addtitle><description>The automatic standardization of nomenclature for anatomical structures in radiotherapy (RT) clinical data is a critical prerequisite for data curation and data-driven research in the era of big data and artificial intelligence, but it is currently an unmet need. Existing methods either cannot handle cross-institutional datasets or suffer from heavy imbalance and poor-quality delineation in clinical RT datasets. To solve these problems, we propose an automated structure nomenclature standardization framework, 3D Non-local Network with Voting (3DNNV). This framework consists of an improved data processing strategy, namely, adaptive sampling and adaptive cropping (ASAC) with voting, and an optimized feature extraction module. The framework simulates clinicians' domain knowledge and recognition mechanisms to identify small-volume organs at risk (OARs) with heavily imbalanced data better than other methods. We used partial data from an open-source head-and-neck cancer dataset to train the model, then tested the model on three cross-institutional datasets to demonstrate its generalizability. 3DNNV outperformed the baseline model, achieving higher average true positive rates (TPR) over all categories on the three test datasets (+8.27%, +2.39%, and +5.53%, respectively). More importantly, the 3DNNV outperformed the baseline on the test dataset, 28.63% to 91.17%, in terms of F1 score for a small-volume OAR with only 9 training samples. The results show that 3DNNV can be applied to identify OARs, even error-prone ones. Furthermore, we discussed the limitations and applicability of the framework in practical scenarios. 
The framework we developed can assist in standardizing structure nomenclature to facilitate data-driven clinical research in cancer radiotherapy.</description><subject>3D classification</subject><subject>Adaptive sampling</subject><subject>Artificial intelligence</subject><subject>Cancer</subject><subject>Data mining</subject><subject>Data models</subject><subject>Data processing</subject><subject>Datasets</subject><subject>deep learning</subject><subject>Domains</subject><subject>Feature extraction</subject><subject>Model testing</subject><subject>Nomenclature standardization</subject><subject>Organs</subject><subject>Radiation therapy</subject><subject>radiotherapy</subject><subject>Semantics</subject><subject>Standardization</subject><subject>Task analysis</subject><subject>Voting</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUctu2zAQFIoGaODkC3IR0LNdviSKvRmukxp1GiBOzsSKXLl0JdGlpBrOLX9e2gqCEiC4nNmd5XKS5IaSGaVEfZkvFsvNZsYIIzOmlCJSfUguGc3VlGc8__hf_Cm57rodiauIUCYvk9d717p2m37zDbg2_dH6Q412i1_TVbMP_i_a9DZAgwcffqdP_gDBdul86GN67wzU9THd9NDaiLuXk9C8hUieqEiEwfRDwPSnb7A1NZwvsc0jWOf7Xxhgf7xKLiqoO7x-OyfJ8-3yafF9un64Wy3m66kRpOinYONcUlWyopAVjCMTtiQ2N1XJLTeyEBIplaSKu6RGFMZyK3KWC1miAcUnyWrUtR52eh9cA-GoPTh9BnzYaghxpho1LTORUaUqjkIUlSkzWhZKSeBASoQ8an0eteIX_Rmw6_XOD6GNz9dMZEIIJnMas_iYZYLvuoDVe1dK9Mk6PVqnT9bpN-ti1c1Y5RDxvUJRIpQq-D_CdJeT</recordid><startdate>2020</startdate><enddate>2020</enddate><creator>Yang, Qiming</creator><creator>Chao, Hongyang</creator><creator>Nguyen, Dan</creator><creator>Jiang, Steve</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-6104-2322</orcidid><orcidid>https://orcid.org/0000-0002-9505-1649</orcidid></search><sort><creationdate>2020</creationdate><title>Mining Domain Knowledge: Improved Framework Towards Automatically Standardizing Anatomical Structure Nomenclature in Radiotherapy</title><author>Yang, Qiming ; Chao, Hongyang ; Nguyen, Dan ; Jiang, Steve</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c408t-ad07979f7f1a5823e24db0d6cfb3d3c7847e1170f170b1c48cd3d462647beca93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>3D classification</topic><topic>Adaptive sampling</topic><topic>Artificial intelligence</topic><topic>Cancer</topic><topic>Data mining</topic><topic>Data models</topic><topic>Data processing</topic><topic>Datasets</topic><topic>deep learning</topic><topic>Domains</topic><topic>Feature extraction</topic><topic>Model testing</topic><topic>Nomenclature standardization</topic><topic>Organs</topic><topic>Radiation therapy</topic><topic>radiotherapy</topic><topic>Semantics</topic><topic>Standardization</topic><topic>Task analysis</topic><topic>Voting</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yang, Qiming</creatorcontrib><creatorcontrib>Chao, Hongyang</creatorcontrib><creatorcontrib>Nguyen, Dan</creatorcontrib><creatorcontrib>Jiang, Steve</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 
1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yang, Qiming</au><au>Chao, Hongyang</au><au>Nguyen, Dan</au><au>Jiang, Steve</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Mining Domain Knowledge: Improved Framework Towards Automatically Standardizing Anatomical Structure Nomenclature in Radiotherapy</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2020</date><risdate>2020</risdate><volume>8</volume><spage>105286</spage><epage>105300</epage><pages>105286-105300</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>The automatic standardization of nomenclature for anatomical structures in radiotherapy (RT) clinical data is a critical prerequisite for data curation and data-driven research in the era of big data and artificial intelligence, but it is currently an unmet need. Existing methods either cannot handle cross-institutional datasets or suffer from heavy imbalance and poor-quality delineation in clinical RT datasets. 
To solve these problems, we propose an automated structure nomenclature standardization framework, 3D Non-local Network with Voting (3DNNV). This framework consists of an improved data processing strategy, namely, adaptive sampling and adaptive cropping (ASAC) with voting, and an optimized feature extraction module. The framework simulates clinicians' domain knowledge and recognition mechanisms to identify small-volume organs at risk (OARs) with heavily imbalanced data better than other methods. We used partial data from an open-source head-and-neck cancer dataset to train the model, then tested the model on three cross-institutional datasets to demonstrate its generalizability. 3DNNV outperformed the baseline model, achieving higher average true positive rates (TPR) over all categories on the three test datasets (+8.27%, +2.39%, and +5.53%, respectively). More importantly, the 3DNNV outperformed the baseline on the test dataset, 28.63% to 91.17%, in terms of F1 score for a small-volume OAR with only 9 training samples. The results show that 3DNNV can be applied to identify OARs, even error-prone ones. Furthermore, we discussed the limitations and applicability of the framework in practical scenarios. The framework we developed can assist in standardizing structure nomenclature to facilitate data-driven clinical research in cancer radiotherapy.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2020.2999079</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0002-6104-2322</orcidid><orcidid>https://orcid.org/0000-0002-9505-1649</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2020, Vol.8, p.105286-105300
issn 2169-3536
2169-3536
language eng
recordid cdi_proquest_journals_2454442761
source IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects 3D classification
Adaptive sampling
Artificial intelligence
Cancer
Data mining
Data models
Data processing
Datasets
deep learning
Domains
Feature extraction
Model testing
Nomenclature standardization
Organs
Radiation therapy
radiotherapy
Semantics
Standardization
Task analysis
Voting
title Mining Domain Knowledge: Improved Framework Towards Automatically Standardizing Anatomical Structure Nomenclature in Radiotherapy
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T06%3A52%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Mining%20Domain%20Knowledge:%20Improved%20Framework%20Towards%20Automatically%20Standardizing%20Anatomical%20Structure%20Nomenclature%20in%20Radiotherapy&rft.jtitle=IEEE%20access&rft.au=Yang,%20Qiming&rft.date=2020&rft.volume=8&rft.spage=105286&rft.epage=105300&rft.pages=105286-105300&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2020.2999079&rft_dat=%3Cproquest_cross%3E2454442761%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2454442761&rft_id=info:pmid/&rft_ieee_id=9104998&rft_doaj_id=oai_doaj_org_article_1b545199f3e448fcb51b8997a3a0bea6&rfr_iscdi=true
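
As a rough illustration of two ideas named in the abstract (voting, and the per-class TPR and F1 metrics), the sketch below aggregates several per-crop label predictions for one structure by majority vote and then scores the result. It is not the authors' implementation: the function names `majority_vote` and `tpr_and_f1`, the label strings, and the toy data are all assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(crop_predictions):
    """Return the most frequent label among per-crop predictions.

    Hypothetical helper: ties resolve to the label seen first in the input.
    """
    return Counter(crop_predictions).most_common(1)[0][0]


def tpr_and_f1(y_true, y_pred, label):
    """Compute the true positive rate and F1 score for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return tpr, f1


# Toy data (assumed): each inner list holds predictions for several crops
# of the same delineated structure.
crops_per_structure = [
    ["BrainStem", "BrainStem", "SpinalCord"],
    ["Chiasm", "OpticNerve_L", "Chiasm"],
    ["SpinalCord", "SpinalCord", "SpinalCord"],
    ["BrainStem", "Chiasm", "Chiasm"],
]
y_true = ["BrainStem", "Chiasm", "SpinalCord", "BrainStem"]

# One voted label per structure.
y_pred = [majority_vote(crops) for crops in crops_per_structure]

for label in sorted(set(y_true)):
    tpr, f1 = tpr_and_f1(y_true, y_pred, label)
    print(f"{label}: TPR={tpr:.2f}, F1={f1:.2f}")
```

Voting over several crops makes a single misleading crop less likely to decide the final label. How the paper weights or combines the votes is not stated in this record, so plain majority voting is used here as a stand-in.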