Thyroid Disease Prediction Using Selective Features and Machine Learning Techniques
Thyroid disease prediction has recently emerged as an important task. Existing diagnostic approaches often target only binary classification, use small datasets, and do not validate their results. Predominantly, existing approaches focus on model optimization and t...
Published in: | Cancers 2022-08, Vol.14 (16), p.3914 |
---|---|
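Main authors: | Chaganti, Rajasekhar; Rustam, Furqan; De La Torre Díez, Isabel; Mazón, Juan Luis Vidal; Rodríguez, Carmen Lili; Ashraf, Imran |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |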
container_end_page | |
---|---|
container_issue | 16 |
container_start_page | 3914 |
container_title | Cancers |
container_volume | 14 |
creator | Chaganti, Rajasekhar; Rustam, Furqan; De La Torre Díez, Isabel; Mazón, Juan Luis Vidal; Rodríguez, Carmen Lili; Ashraf, Imran |
description | Thyroid disease prediction has recently emerged as an important task. Existing diagnostic approaches often target only binary classification, use small datasets, and do not validate their results. They focus predominantly on model optimization, while feature engineering is less investigated. To overcome these limitations, this study presents an approach that investigates feature engineering for machine learning and deep learning models. Forward feature selection, backward feature elimination, bidirectional feature elimination, and machine learning-based feature selection using extra tree classifiers are adopted. The proposed approach can predict Hashimoto’s thyroiditis (primary hypothyroid), binding protein (increased binding protein), autoimmune thyroiditis (compensated hypothyroid), and non-thyroidal syndrome (NTIS) (concurrent non-thyroidal illness). Extensive experiments show that features selected with the extra tree classifier yield the best results, with an accuracy and F1 score of 0.99 when used with the random forest classifier. The results suggest that machine learning models are a better choice for thyroid disease detection in terms of accuracy and computational complexity. K-fold cross-validation and a performance comparison with existing studies corroborate the superior performance of the proposed approach. |
doi_str_mv | 10.3390/cancers14163914 |
format | Article |
pmid | 36010907 |
publisher | Basel: MDPI AG |
rights | COPYRIGHT 2022 MDPI AG. 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
orcidid | 0000-0002-0982-815X; 0000-0001-8403-1047; 0000-0002-9609-4026; 0000-0003-3134-7720; 0000-0001-5341-6729; 0000-0002-8271-6496 |
linktohtml | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9405591/ |
linktopdf | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9405591/pdf/ |
fulltext | fulltext |
identifier | ISSN: 2072-6694 |
ispartof | Cancers, 2022-08, Vol.14 (16), p.3914 |
issn | 2072-6694 2072-6694 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9405591 |
source | PubMed Central Open Access; MDPI - Multidisciplinary Digital Publishing Institute; EZB-FREE-00999 freely available EZB journals; PubMed Central |
subjects | Accuracy; Algorithms; Classification; Computer applications; Data mining; Data processing; Datasets; Deep learning; Diagnosis; Feature selection; Forecasts and trends; Health care; Hyperthyroidism; Hypothyroidism; Learning algorithms; Machine learning; Neural networks; Predictions; Support vector machines; Thyroid cancer; Thyroid diseases; Thyroiditis |
title | Thyroid Disease Prediction Using Selective Features and Machine Learning Techniques |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T12%3A49%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Thyroid%20Disease%20Prediction%20Using%20Selective%20Features%20and%20Machine%20Learning%20Techniques&rft.jtitle=Cancers&rft.au=Chaganti,%20Rajasekhar&rft.date=2022-08-01&rft.volume=14&rft.issue=16&rft.spage=3914&rft.pages=3914-&rft.issn=2072-6694&rft.eissn=2072-6694&rft_id=info:doi/10.3390/cancers14163914&rft_dat=%3Cgale_pubme%3EA745271262%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2706125883&rft_id=info:pmid/36010907&rft_galeid=A745271262&rfr_iscdi=true |
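
The description field names forward feature selection and k-fold cross-validation among the techniques studied. The following is a minimal sketch of how those two ideas fit together, not the authors' implementation: the data are synthetic, a simple nearest-centroid rule stands in for the extra tree and random forest classifiers used in the article, and all function names (`make_data`, `cv_accuracy`, `forward_select`) are hypothetical.

```python
import random


def make_data(n=60, seed=0):
    """Synthetic two-class data (an assumption for illustration only):
    column 0 separates the classes, columns 1-3 are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(5.0 * label, 0.5)] + [rng.gauss(0.0, 1.0) for _ in range(3)]
        X.append(row)
        y.append(label)
    return X, y


def nearest_centroid_predict(X_train, y_train, X_test, cols):
    """Stand-in classifier: assign each test row to the closest class centroid,
    measured only over the chosen feature columns."""
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
    preds = []
    for x in X_test:
        dist = {
            label: sum((x[c] - m) ** 2 for c, m in zip(cols, centre))
            for label, centre in centroids.items()
        }
        preds.append(min(dist, key=dist.get))
    return preds


def cv_accuracy(X, y, cols, k=5):
    """Mean accuracy over k folds; fold i holds out every k-th sample starting at i."""
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, len(X), k))
        X_tr = [x for i, x in enumerate(X) if i not in test_idx]
        y_tr = [t for i, t in enumerate(y) if i not in test_idx]
        X_te = [X[i] for i in sorted(test_idx)]
        y_te = [y[i] for i in sorted(test_idx)]
        preds = nearest_centroid_predict(X_tr, y_tr, X_te, cols)
        correct = sum(1 for p, t in zip(preds, y_te) if p == t)
        scores.append(correct / len(y_te))
    return sum(scores) / len(scores)


def forward_select(X, y, max_features=2):
    """Greedy forward selection: repeatedly add the feature that most improves
    cross-validated accuracy, and stop when no candidate improves it."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        score, col = max((cv_accuracy(X, y, selected + [c]), c) for c in remaining)
        if score <= best_score:
            break
        selected.append(col)
        remaining.remove(col)
        best_score = score
    return selected, best_score


X, y = make_data()
selected, score = forward_select(X, y)
print("selected feature columns:", selected)
print("cross-validated accuracy:", round(score, 3))
```

Backward elimination follows the same pattern in reverse: start from all columns and repeatedly drop the one whose removal hurts cross-validated accuracy least. Scoring every candidate with cross-validation rather than a single split is what keeps the selection from being tuned to one particular partition of the data.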