Data Analysis of the Lung Imaging Database Consortium and Image Database Resource Initiative

Rationale and Objectives The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) is the largest publicly available computed tomography (CT) image reference data set of lung nodules. In this article, a comprehensive data analysis of the data set and a uniform data mode...

Detailed description

Saved in:
Bibliographic details
Published in: Academic radiology 2015-04, Vol.22 (4), p.488-495
Main authors: Wang, Weisheng, PhD; Luo, Jiawei, PhD; Yang, Xuedong, PhD; Lin, Hongli, PhD
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 495
container_issue 4
container_start_page 488
container_title Academic radiology
container_volume 22
creator Wang, Weisheng, PhD
Luo, Jiawei, PhD
Yang, Xuedong, PhD
Lin, Hongli, PhD
description Rationale and Objectives The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) is the largest publicly available computed tomography (CT) image reference data set of lung nodules. In this article, a comprehensive data analysis of the data set and a uniform data model are presented with the purpose of facilitating potential researchers to have an in-depth understanding to and efficient use of the data set in their lung cancer–related investigations. Materials and Methods A uniform data model was designed for representation and organization of various types of information contained in different source data files. A software tool was developed for the processing and analysis of the database, which 1) automatically aligns and graphically displays the nodule outlines marked manually by radiologists onto the corresponding CT images; 2) extracts diagnostic nodule characteristics annotated by radiologists; 3) calculates a variety of nodule image features based on the outlines of nodules, including diameter, volume, and degree of roundness, and so forth; 4) integrates all the extracted nodule information into the uniform data model and stores it in a common and easy-to-access data format; and 5) analyzes and summarizes various feature distributions of nodules in several different categories. Using this data processing and analysis tool, all 1018 CT scans from the data set were processed and analyzed for their statistical distribution. Results The information contained in different source data files with different formats was extracted and integrated into a new and uniform data model. Based on the new data model, the statistical distributions of nodules in terms of nodule geometric features and diagnostic characteristics were summarized. In the LIDC/IDRI data set, 2655 nodules ≥3 mm, 5875 nodules
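The description lists diameter, volume, and degree of roundness among the features calculated from the nodule outlines, but this record does not say how they are defined. The following is only a minimal sketch under stated assumptions, not the authors' method: each outline is taken to be a closed polygon in millimetres on one CT slice, diameter is the equivalent-circle diameter of the largest slice, volume is summed slice area times slice spacing, and roundness is 4πA/P². The function and parameter names are hypothetical.

```python
import math


def polygon_area_perimeter(points):
    """Return (area, perimeter) of a closed 2D polygon given as (x, y) pairs in mm."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0  # shoelace term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter


def nodule_features(outlines, slice_spacing_mm):
    """Compute illustrative features from one outline per slice (assumed definitions)."""
    areas = []
    roundness_values = []
    for outline in outlines:
        area, perimeter = polygon_area_perimeter(outline)
        areas.append(area)
        # Assumed roundness measure: 1.0 for a perfect circle, smaller otherwise.
        roundness_values.append(4.0 * math.pi * area / perimeter ** 2)
    return {
        # Assumed diameter: circle with the same area as the largest slice.
        "diameter_mm": 2.0 * math.sqrt(max(areas) / math.pi),
        # Assumed volume: stack of slabs, one per slice.
        "volume_mm3": sum(areas) * slice_spacing_mm,
        "roundness": sum(roundness_values) / len(roundness_values),
    }


# Toy input: the same 2 mm x 2 mm square outline on two adjacent slices.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
features = nodule_features([square, square], slice_spacing_mm=1.5)
print(features)
```

Other definitions (for example, the longest in-plane chord as diameter) are equally plausible; the sketch only illustrates the kind of computation the description refers to.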
doi_str_mv 10.1016/j.acra.2014.12.004
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1662428015</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1076633214004620</els_id><sourcerecordid>1662428015</sourcerecordid><originalsourceid>FETCH-LOGICAL-c481t-655dcb774ff9ff65b762f64bbd1e9b87ad8e2491248caeb1c24acdd84a07235e3</originalsourceid><addsrcrecordid>eNp9kc1r3DAQxUVpSTYf_0APwcde7EqyLGshBJZtky4sBNLmFhBjabzVxh-pZAf2v4-c3SbQQ08zMO89mN8j5DOjGaNMft1mYDxknDKRMZ5RKj6QGVOlSgUV8mPcaSlTmef8mJyEsKWUFVLlR-SYF5KynMoZefgGAySLDppdcCHp62T4jcl67DbJqoWNi3NSVBAwWfZd6P3gxjaBzr7e8f16h6EfvcFk1bnBweCe8Yx8qqEJeH6Yp-T--vuv5Y90fXuzWi7WqRGKDaksCmuqshR1Pa9rWVSl5LUUVWUZzitVglXIxZxxoQxgxQwXYKxVAmjJ8wLzU_Jln_vk-z8jhkG3LhhsGuiwH4NmUnLBVXw_SvleanwfgsdaP3nXgt9pRvVEVW_1RFVPVDXjOlKNpotD_li1aN8sfzFGweVegPHLZ4deB-OwM2idRzNo27v_51_9YzeN65yB5hF3GLaRaywo_qFDNOifU69TrUxEt-Q0fwGP451I</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1662428015</pqid></control><display><type>article</type><title>Data Analysis of the Lung Imaging Database Consortium and Image Database Resource Initiative</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><creator>Wang, Weisheng, PhD ; Luo, Jiawei, PhD ; Yang, Xuedong, PhD ; Lin, Hongli, PhD</creator><creatorcontrib>Wang, Weisheng, PhD ; Luo, Jiawei, PhD ; Yang, Xuedong, PhD ; Lin, Hongli, PhD</creatorcontrib><description>Rationale and Objectives The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) is the largest publicly available computed tomography (CT) image reference data set of lung nodules. In this article, a comprehensive data analysis of the data set and a uniform data model are presented with the purpose of facilitating potential researchers to have an in-depth understanding to and efficient use of the data set in their lung cancer–related investigations. 
Materials and Methods A uniform data model was designed for representation and organization of various types of information contained in different source data files. A software tool was developed for the processing and analysis of the database, which 1) automatically aligns and graphically displays the nodule outlines marked manually by radiologists onto the corresponding CT images; 2) extracts diagnostic nodule characteristics annotated by radiologists; 3) calculates a variety of nodule image features based on the outlines of nodules, including diameter, volume, and degree of roundness, and so forth; 4) integrates all the extracted nodule information into the uniform data model and stores it in a common and easy-to-access data format; and 5) analyzes and summarizes various feature distributions of nodules in several different categories. Using this data processing and analysis tool, all 1018 CT scans from the data set were processed and analyzed for their statistical distribution. Results The information contained in different source data files with different formats was extracted and integrated into a new and uniform data model. Based on the new data model, the statistical distributions of nodules in terms of nodule geometric features and diagnostic characteristics were summarized. In the LIDC/IDRI data set, 2655 nodules ≥3 mm, 5875 nodules &lt;3 mm, and 7411 non-nodules are identified, respectively. Among the 2655 nodules, 1) 775, 488, 481, and 911 were marked by one, two, three, or four radiologists, respectively; 2) most of nodules ≥3 mm (85.7%) have a diameter &lt;10.0 mm with the mean value of 6.72 mm; and 3) 10.87%, 31.4%, 38.8%, 16.4%, and 2.6% of nodules were assessed with a malignancy score of 1, 2, 3, 4, and 5, respectively. Conclusions This study demonstrates the usefulness of the proposed software tool to the potential users for an in-depth understanding of the LIDC/IDRI data set, therefore likely to be beneficial to their future investigations. 
The analysis results also demonstrate the distribution diversity of nodules characteristics, therefore being useful as a reference resource for assessing the performance of a new and existing nodule detection and/or segmentation schemes.</description><identifier>ISSN: 1076-6332</identifier><identifier>EISSN: 1878-4046</identifier><identifier>DOI: 10.1016/j.acra.2014.12.004</identifier><identifier>PMID: 25601306</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Databases, Factual - statistics &amp; numerical data ; Humans ; Imaging, Three-Dimensional ; LIDC/DIRI database ; Lung - diagnostic imaging ; Lung Neoplasms - diagnostic imaging ; lung nodule ; quantitative analysis ; Radiographic Image Interpretation, Computer-Assisted ; Radiology ; Radiology Information Systems - statistics &amp; numerical data ; Software ; Solitary Pulmonary Nodule - diagnostic imaging ; Tomography, X-Ray Computed - statistics &amp; numerical data</subject><ispartof>Academic radiology, 2015-04, Vol.22 (4), p.488-495</ispartof><rights>AUR</rights><rights>2015 AUR</rights><rights>Copyright © 2015 AUR. Published by Elsevier Inc. 
All rights reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c481t-655dcb774ff9ff65b762f64bbd1e9b87ad8e2491248caeb1c24acdd84a07235e3</citedby><cites>FETCH-LOGICAL-c481t-655dcb774ff9ff65b762f64bbd1e9b87ad8e2491248caeb1c24acdd84a07235e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S1076633214004620$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/25601306$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Wang, Weisheng, PhD</creatorcontrib><creatorcontrib>Luo, Jiawei, PhD</creatorcontrib><creatorcontrib>Yang, Xuedong, PhD</creatorcontrib><creatorcontrib>Lin, Hongli, PhD</creatorcontrib><title>Data Analysis of the Lung Imaging Database Consortium and Image Database Resource Initiative</title><title>Academic radiology</title><addtitle>Acad Radiol</addtitle><description>Rationale and Objectives The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) is the largest publicly available computed tomography (CT) image reference data set of lung nodules. In this article, a comprehensive data analysis of the data set and a uniform data model are presented with the purpose of facilitating potential researchers to have an in-depth understanding to and efficient use of the data set in their lung cancer–related investigations. Materials and Methods A uniform data model was designed for representation and organization of various types of information contained in different source data files. 
A software tool was developed for the processing and analysis of the database, which 1) automatically aligns and graphically displays the nodule outlines marked manually by radiologists onto the corresponding CT images; 2) extracts diagnostic nodule characteristics annotated by radiologists; 3) calculates a variety of nodule image features based on the outlines of nodules, including diameter, volume, and degree of roundness, and so forth; 4) integrates all the extracted nodule information into the uniform data model and stores it in a common and easy-to-access data format; and 5) analyzes and summarizes various feature distributions of nodules in several different categories. Using this data processing and analysis tool, all 1018 CT scans from the data set were processed and analyzed for their statistical distribution. Results The information contained in different source data files with different formats was extracted and integrated into a new and uniform data model. Based on the new data model, the statistical distributions of nodules in terms of nodule geometric features and diagnostic characteristics were summarized. In the LIDC/IDRI data set, 2655 nodules ≥3 mm, 5875 nodules &lt;3 mm, and 7411 non-nodules are identified, respectively. Among the 2655 nodules, 1) 775, 488, 481, and 911 were marked by one, two, three, or four radiologists, respectively; 2) most of nodules ≥3 mm (85.7%) have a diameter &lt;10.0 mm with the mean value of 6.72 mm; and 3) 10.87%, 31.4%, 38.8%, 16.4%, and 2.6% of nodules were assessed with a malignancy score of 1, 2, 3, 4, and 5, respectively. Conclusions This study demonstrates the usefulness of the proposed software tool to the potential users for an in-depth understanding of the LIDC/IDRI data set, therefore likely to be beneficial to their future investigations. 
The analysis results also demonstrate the distribution diversity of nodules characteristics, therefore being useful as a reference resource for assessing the performance of a new and existing nodule detection and/or segmentation schemes.</description><subject>Databases, Factual - statistics &amp; numerical data</subject><subject>Humans</subject><subject>Imaging, Three-Dimensional</subject><subject>LIDC/DIRI database</subject><subject>Lung - diagnostic imaging</subject><subject>Lung Neoplasms - diagnostic imaging</subject><subject>lung nodule</subject><subject>quantitative analysis</subject><subject>Radiographic Image Interpretation, Computer-Assisted</subject><subject>Radiology</subject><subject>Radiology Information Systems - statistics &amp; numerical data</subject><subject>Software</subject><subject>Solitary Pulmonary Nodule - diagnostic imaging</subject><subject>Tomography, X-Ray Computed - statistics &amp; numerical data</subject><issn>1076-6332</issn><issn>1878-4046</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kc1r3DAQxUVpSTYf_0APwcde7EqyLGshBJZtky4sBNLmFhBjabzVxh-pZAf2v4-c3SbQQ08zMO89mN8j5DOjGaNMft1mYDxknDKRMZ5RKj6QGVOlSgUV8mPcaSlTmef8mJyEsKWUFVLlR-SYF5KynMoZefgGAySLDppdcCHp62T4jcl67DbJqoWNi3NSVBAwWfZd6P3gxjaBzr7e8f16h6EfvcFk1bnBweCe8Yx8qqEJeH6Yp-T--vuv5Y90fXuzWi7WqRGKDaksCmuqshR1Pa9rWVSl5LUUVWUZzitVglXIxZxxoQxgxQwXYKxVAmjJ8wLzU_Jln_vk-z8jhkG3LhhsGuiwH4NmUnLBVXw_SvleanwfgsdaP3nXgt9pRvVEVW_1RFVPVDXjOlKNpotD_li1aN8sfzFGweVegPHLZ4deB-OwM2idRzNo27v_51_9YzeN65yB5hF3GLaRaywo_qFDNOifU69TrUxEt-Q0fwGP451I</recordid><startdate>20150401</startdate><enddate>20150401</enddate><creator>Wang, Weisheng, PhD</creator><creator>Luo, Jiawei, PhD</creator><creator>Yang, Xuedong, PhD</creator><creator>Lin, Hongli, PhD</creator><general>Elsevier 
Inc</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>20150401</creationdate><title>Data Analysis of the Lung Imaging Database Consortium and Image Database Resource Initiative</title><author>Wang, Weisheng, PhD ; Luo, Jiawei, PhD ; Yang, Xuedong, PhD ; Lin, Hongli, PhD</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c481t-655dcb774ff9ff65b762f64bbd1e9b87ad8e2491248caeb1c24acdd84a07235e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Databases, Factual - statistics &amp; numerical data</topic><topic>Humans</topic><topic>Imaging, Three-Dimensional</topic><topic>LIDC/DIRI database</topic><topic>Lung - diagnostic imaging</topic><topic>Lung Neoplasms - diagnostic imaging</topic><topic>lung nodule</topic><topic>quantitative analysis</topic><topic>Radiographic Image Interpretation, Computer-Assisted</topic><topic>Radiology</topic><topic>Radiology Information Systems - statistics &amp; numerical data</topic><topic>Software</topic><topic>Solitary Pulmonary Nodule - diagnostic imaging</topic><topic>Tomography, X-Ray Computed - statistics &amp; numerical data</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Weisheng, PhD</creatorcontrib><creatorcontrib>Luo, Jiawei, PhD</creatorcontrib><creatorcontrib>Yang, Xuedong, PhD</creatorcontrib><creatorcontrib>Lin, Hongli, PhD</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Academic radiology</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wang, Weisheng, PhD</au><au>Luo, Jiawei, PhD</au><au>Yang, Xuedong, PhD</au><au>Lin, Hongli, PhD</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Data Analysis of the Lung Imaging Database Consortium and Image Database Resource Initiative</atitle><jtitle>Academic radiology</jtitle><addtitle>Acad Radiol</addtitle><date>2015-04-01</date><risdate>2015</risdate><volume>22</volume><issue>4</issue><spage>488</spage><epage>495</epage><pages>488-495</pages><issn>1076-6332</issn><eissn>1878-4046</eissn><abstract>Rationale and Objectives The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) is the largest publicly available computed tomography (CT) image reference data set of lung nodules. In this article, a comprehensive data analysis of the data set and a uniform data model are presented with the purpose of facilitating potential researchers to have an in-depth understanding to and efficient use of the data set in their lung cancer–related investigations. Materials and Methods A uniform data model was designed for representation and organization of various types of information contained in different source data files. A software tool was developed for the processing and analysis of the database, which 1) automatically aligns and graphically displays the nodule outlines marked manually by radiologists onto the corresponding CT images; 2) extracts diagnostic nodule characteristics annotated by radiologists; 3) calculates a variety of nodule image features based on the outlines of nodules, including diameter, volume, and degree of roundness, and so forth; 4) integrates all the extracted nodule information into the uniform data model and stores it in a common and easy-to-access data format; and 5) analyzes and summarizes various feature distributions of nodules in several different categories. 
Using this data processing and analysis tool, all 1018 CT scans from the data set were processed and analyzed for their statistical distribution. Results The information contained in different source data files with different formats was extracted and integrated into a new and uniform data model. Based on the new data model, the statistical distributions of nodules in terms of nodule geometric features and diagnostic characteristics were summarized. In the LIDC/IDRI data set, 2655 nodules ≥3 mm, 5875 nodules &lt;3 mm, and 7411 non-nodules are identified, respectively. Among the 2655 nodules, 1) 775, 488, 481, and 911 were marked by one, two, three, or four radiologists, respectively; 2) most of nodules ≥3 mm (85.7%) have a diameter &lt;10.0 mm with the mean value of 6.72 mm; and 3) 10.87%, 31.4%, 38.8%, 16.4%, and 2.6% of nodules were assessed with a malignancy score of 1, 2, 3, 4, and 5, respectively. Conclusions This study demonstrates the usefulness of the proposed software tool to the potential users for an in-depth understanding of the LIDC/IDRI data set, therefore likely to be beneficial to their future investigations. The analysis results also demonstrate the distribution diversity of nodules characteristics, therefore being useful as a reference resource for assessing the performance of a new and existing nodule detection and/or segmentation schemes.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>25601306</pmid><doi>10.1016/j.acra.2014.12.004</doi><tpages>8</tpages></addata></record>
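The counts quoted in the abstract inside the record above can be cross-checked with simple arithmetic. The snippet below is a small sketch that uses only figures stated in the abstract; the variable names are illustrative.

```python
# Figures copied from the abstract in the record.
nodules_ge_3mm = 2655
nodules_lt_3mm = 5875
non_nodules = 7411

# Number of nodules >= 3 mm marked by one, two, three, or four radiologists.
marked_by = {1: 775, 2: 488, 3: 481, 4: 911}

# Reported share of nodules per malignancy score (percent).
malignancy_pct = {1: 10.87, 2: 31.4, 3: 38.8, 4: 16.4, 5: 2.6}

# The agreement counts should add up to the number of nodules >= 3 mm.
agreement_total = sum(marked_by.values())
print("agreement total:", agreement_total, "matches:", agreement_total == nodules_ge_3mm)

# Share of nodules at each agreement level, in percent.
agreement_pct = {k: round(100.0 * v / nodules_ge_3mm, 1) for k, v in marked_by.items()}
print("agreement shares (%):", agreement_pct)

# The malignancy shares should sum to roughly 100%.
malignancy_total = round(sum(malignancy_pct.values()), 2)
print("malignancy shares sum to", malignancy_total)

# Total number of marked findings across all three categories.
all_findings = nodules_ge_3mm + nodules_lt_3mm + non_nodules
print("all findings:", all_findings)
```

The four agreement counts sum exactly to 2655, while the malignancy shares as printed in the abstract sum to slightly more than 100%.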
fulltext fulltext
identifier ISSN: 1076-6332
ispartof Academic radiology, 2015-04, Vol.22 (4), p.488-495
issn 1076-6332
1878-4046
language eng
recordid cdi_proquest_miscellaneous_1662428015
source MEDLINE; Elsevier ScienceDirect Journals
subjects Databases, Factual - statistics & numerical data
Humans
Imaging, Three-Dimensional
LIDC/DIRI database
Lung - diagnostic imaging
Lung Neoplasms - diagnostic imaging
lung nodule
quantitative analysis
Radiographic Image Interpretation, Computer-Assisted
Radiology
Radiology Information Systems - statistics & numerical data
Software
Solitary Pulmonary Nodule - diagnostic imaging
Tomography, X-Ray Computed - statistics & numerical data
title Data Analysis of the Lung Imaging Database Consortium and Image Database Resource Initiative
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T20%3A19%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Data%20Analysis%20of%20the%20Lung%20Imaging%20Database%20Consortium%20and%20Image%20Database%20Resource%20Initiative&rft.jtitle=Academic%20radiology&rft.au=Wang,%20Weisheng,%20PhD&rft.date=2015-04-01&rft.volume=22&rft.issue=4&rft.spage=488&rft.epage=495&rft.pages=488-495&rft.issn=1076-6332&rft.eissn=1878-4046&rft_id=info:doi/10.1016/j.acra.2014.12.004&rft_dat=%3Cproquest_cross%3E1662428015%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1662428015&rft_id=info:pmid/25601306&rft_els_id=S1076633214004620&rfr_iscdi=true
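The resolver link above carries the citation as OpenURL key/value pairs (`rft.jtitle`, `rft.volume`, `rft_id`, and so on). As a minimal sketch, the Python standard library can decode them. The URL below is a shortened copy containing only a subset of the parameters from the record, and the helper names `first` and `identifier` are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Shortened copy of the resolver link from the record (subset of its parameters).
url = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.jtitle=Academic%20radiology"
    "&rft.date=2015-04-01&rft.volume=22&rft.issue=4"
    "&rft.spage=488&rft.epage=495"
    "&rft.issn=1076-6332&rft.eissn=1878-4046"
    "&rft_id=info:doi/10.1016/j.acra.2014.12.004"
    "&rft_id=info:pmid/25601306"
)

# parse_qs percent-decodes values and collects repeated keys into lists.
params = parse_qs(urlsplit(url).query)


def first(key):
    """Return the first value for a key, or None if the key is absent."""
    return params.get(key, [None])[0]


def identifier(prefix):
    """Return the rft_id value that starts with the given prefix, without the prefix."""
    for value in params.get("rft_id", []):
        if value.startswith(prefix):
            return value[len(prefix):]
    return None


citation = {
    "journal": first("rft.jtitle"),
    "date": first("rft.date"),
    "volume": first("rft.volume"),
    "issue": first("rft.issue"),
    "pages": f"{first('rft.spage')}-{first('rft.epage')}",
    "doi": identifier("info:doi/"),
    "pmid": identifier("info:pmid/"),
}
print(citation)
```

Because `rft_id` appears more than once in the link, it is looked up by prefix rather than by position.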