Data mining and genetic algorithm based gene/SNP selection

Objective: Genomic studies provide large volumes of data with the number of single nucleotide polymorphisms (SNPs) ranging into thousands. The analysis of SNPs permits determining relationships between genotypic and phenotypic information as well as the identification of SNPs related to a disease. T...
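The full abstract (see the description field of this record) lists the identification of intersecting feature sets as one of the steps used to select significant genes/SNPs. The idea can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation: the selector names, the SNP identifiers, and the helper `intersect_feature_sets` are all hypothetical.

```python
# Hypothetical SNP identifiers picked by three hypothetical selectors.
# None of these names or values come from the paper.
selected_by_method = {
    "weighted_decision_tree": {"snp_01", "snp_04", "snp_07", "snp_09"},
    "decision_tree_wrapper": {"snp_01", "snp_04", "snp_05", "snp_09"},
    "correlation_heuristic": {"snp_01", "snp_02", "snp_04", "snp_09"},
}


def intersect_feature_sets(feature_sets):
    """Return the features present in every set (empty set if none given)."""
    sets = list(feature_sets)
    if not sets:
        return set()
    return set.intersection(*sets)


common = intersect_feature_sets(selected_by_method.values())
print(sorted(common))  # ['snp_01', 'snp_04', 'snp_09']
```

Keeping only features that every selector agrees on trades recall for confidence: a feature chosen by a single method is dropped, which is one plausible reading of why the abstract calls the intersected set the "most significant" one.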

Bibliographic details
Published in:	Artificial intelligence in medicine 2004-07, Vol.31 (3), p.183-196
Main authors:	Shah, Shital C. ; Kusiak, Andrew
Format:	Article
Language:	eng
Subjects:	Algorithms ; Artificial intelligence ; Computer applications ; Data mining ; Decision Trees ; DNA sequence analysis ; Drug effectiveness ; Drug Therapy ; Feature selection ; Genes ; Genetic algorithm ; Genetics ; Genomics ; Humans ; Information Storage and Retrieval ; Intersection approach ; Medicine ; Models, Genetic ; Polymorphism, Single Nucleotide ; Prognosis ; Single nucleotide polymorphisms ; Single nucleotide polymorphisms (SNPs) ; Treatment Outcome
Online access:	Full text

container_end_page	196
container_issue	3
container_start_page	183
container_title	Artificial intelligence in medicine
container_volume	31
creator	Shah, Shital C. ; Kusiak, Andrew
description	Objective: Genomic studies provide large volumes of data with the number of single nucleotide polymorphisms (SNPs) ranging into thousands. The analysis of SNPs permits determining relationships between genotypic and phenotypic information as well as the identification of SNPs related to a disease. The growing wealth of information and advances in biology call for the development of approaches for discovery of new knowledge. One such area is the identification of gene/SNP patterns impacting cure/drug development for various diseases. Methods: A new approach for predicting drug effectiveness is presented. The approach is based on data mining and genetic algorithms. A global search mechanism, weighted decision tree, decision-tree-based wrapper, a correlation-based heuristic, and the identification of intersecting feature sets are employed for selecting significant genes. Results: The feature selection approach has resulted in 85% reduction of number of features. The relative increase in cross-validation accuracy and specificity for the significant gene/SNP set was 10% and 3.2%, respectively. Conclusion: The feature selection approach was successfully applied to data sets for drug and placebo subjects. The number of features has been significantly reduced while the quality of knowledge was enhanced. The feature set intersection approach provided the most significant genes/SNPs. The results reported in the paper discuss associations among SNPs resulting in patient-specific treatment protocols.
doi_str_mv	10.1016/j.artmed.2004.04.002
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_66773192</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0933365704000521</els_id><sourcerecordid>66773192</sourcerecordid><originalsourceid>FETCH-LOGICAL-c420t-41c991da80a9b6dd292c5f07e178709b66cb0d74075d86bc4cb7565994005ce93</originalsourceid><addsrcrecordid>eNqFkdtqGzEQhkVJqB03bxDKXuVu7dGujr0oBOfQgGkLba6FVho7MntwpHWgb99d1vSyhoGBmW8O_D8hNxSWFKhY7Zc29g36ZQHAlmNA8YHMqZJlXigBF2QOuizzUnA5I1cp7QFAMio-khnlJRSg-Jx8ube9zZrQhnaX2dZnO2yxDy6z9a6LoX9tssomnOqrX99_ZglrdH3o2k_kcmvrhNenvCAvjw-_19_yzY-n5_XdJnesgD5n1GlNvVVgdSW8L3Th-BYkUqkkDCXhKvCSgeReicoxV0kuuNYMgDvU5YLcTnsPsXs7YupNE5LDurYtdsdkhJCypLo4C3IpKONUnAWp5KoUSg0gm0AXu5Qibs0hhsbGP4aCGV0wezO5YEYXzBgwPvL5tP9Yjb1_QyfZB-DrBOCg23vAaJIL2Dr0IQ7iGt-F_1_4C9Cgl_8</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>17583688</pqid></control><display><type>article</type><title>Data mining and genetic algorithm based gene/SNP selection</title><source>MEDLINE</source><source>ScienceDirect Journals (5 years ago - present)</source><creator>Shah, Shital C. ; Kusiak, Andrew</creator><creatorcontrib>Shah, Shital C. ; Kusiak, Andrew</creatorcontrib><description>Objective: Genomic studies provide large volumes of data with the number of single nucleotide polymorphisms (SNPs) ranging into thousands. The analysis of SNPs permits determining relationships between genotypic and phenotypic information as well as the identification of SNPs related to a disease. The growing wealth of information and advances in biology call for the development of approaches for discovery of new knowledge. One such area is the identification of gene/SNP patterns impacting cure/drug development for various diseases. Methods: A new approach for predicting drug effectiveness is presented. The approach is based on data mining and genetic algorithms. 
A global search mechanism, weighted decision tree, decision-tree-based wrapper, a correlation-based heuristic, and the identification of intersecting feature sets are employed for selecting significant genes. Results: The feature selection approach has resulted in 85% reduction of number of features. The relative increase in cross-validation accuracy and specificity for the significant gene/SNP set was 10% and 3.2%, respectively. Conclusion: The feature selection approach was successfully applied to data sets for drug and placebo subjects. The number of features has been significantly reduced while the quality of knowledge was enhanced. The feature set intersection approach provided the most significant genes/SNPs. The results reported in the paper discuss associations among SNPs resulting in patient-specific treatment protocols.</description><identifier>ISSN: 0933-3657</identifier><identifier>EISSN: 1873-2860</identifier><identifier>DOI: 10.1016/j.artmed.2004.04.002</identifier><identifier>PMID: 15302085</identifier><language>eng</language><publisher>Netherlands: Elsevier B.V</publisher><subject>Algorithms ; Artificial intelligence ; Computer applications ; Data mining ; Decision Trees ; DNA sequence analysis ; Drug effectiveness ; Drug Therapy ; Feature selection ; Genes ; Genetic algorithm ; Genetics ; Genomics ; Humans ; Information Storage and Retrieval ; Intersection approach ; Medicine ; Models, Genetic ; Polymorphism, Single Nucleotide ; Prognosis ; Single nucleotide polymorphisms ; Single nucleotide polymorphisms (SNPs) ; Treatment Outcome</subject><ispartof>Artificial intelligence in medicine, 2004-07, Vol.31 (3), p.183-196</ispartof><rights>2004 Elsevier 
B.V.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c420t-41c991da80a9b6dd292c5f07e178709b66cb0d74075d86bc4cb7565994005ce93</citedby><cites>FETCH-LOGICAL-c420t-41c991da80a9b6dd292c5f07e178709b66cb0d74075d86bc4cb7565994005ce93</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.artmed.2004.04.002$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3548,27923,27924,45994</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/15302085$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Shah, Shital C.</creatorcontrib><creatorcontrib>Kusiak, Andrew</creatorcontrib><title>Data mining and genetic algorithm based gene/SNP selection</title><title>Artificial intelligence in medicine</title><addtitle>Artif Intell Med</addtitle><description>Objective: Genomic studies provide large volumes of data with the number of single nucleotide polymorphisms (SNPs) ranging into thousands. The analysis of SNPs permits determining relationships between genotypic and phenotypic information as well as the identification of SNPs related to a disease. The growing wealth of information and advances in biology call for the development of approaches for discovery of new knowledge. One such area is the identification of gene/SNP patterns impacting cure/drug development for various diseases. Methods: A new approach for predicting drug effectiveness is presented. The approach is based on data mining and genetic algorithms. A global search mechanism, weighted decision tree, decision-tree-based wrapper, a correlation-based heuristic, and the identification of intersecting feature sets are employed for selecting significant genes. 
Results: The feature selection approach has resulted in 85% reduction of number of features. The relative increase in cross-validation accuracy and specificity for the significant gene/SNP set was 10% and 3.2%, respectively. Conclusion: The feature selection approach was successfully applied to data sets for drug and placebo subjects. The number of features has been significantly reduced while the quality of knowledge was enhanced. The feature set intersection approach provided the most significant genes/SNPs. The results reported in the paper discuss associations among SNPs resulting in patient-specific treatment protocols.</description><subject>Algorithms</subject><subject>Artificial intelligence</subject><subject>Computer applications</subject><subject>Data mining</subject><subject>Decision Trees</subject><subject>DNA sequence analysis</subject><subject>Drug effectiveness</subject><subject>Drug Therapy</subject><subject>Feature selection</subject><subject>Genes</subject><subject>Genetic algorithm</subject><subject>Genetics</subject><subject>Genomics</subject><subject>Humans</subject><subject>Information Storage and Retrieval</subject><subject>Intersection approach</subject><subject>Medicine</subject><subject>Models, Genetic</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Prognosis</subject><subject>Single nucleotide polymorphisms</subject><subject>Single nucleotide polymorphisms (SNPs)</subject><subject>Treatment 
Outcome</subject><issn>0933-3657</issn><issn>1873-2860</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2004</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkdtqGzEQhkVJqB03bxDKXuVu7dGujr0oBOfQgGkLba6FVho7MntwpHWgb99d1vSyhoGBmW8O_D8hNxSWFKhY7Zc29g36ZQHAlmNA8YHMqZJlXigBF2QOuizzUnA5I1cp7QFAMio-khnlJRSg-Jx8ube9zZrQhnaX2dZnO2yxDy6z9a6LoX9tssomnOqrX99_ZglrdH3o2k_kcmvrhNenvCAvjw-_19_yzY-n5_XdJnesgD5n1GlNvVVgdSW8L3Th-BYkUqkkDCXhKvCSgeReicoxV0kuuNYMgDvU5YLcTnsPsXs7YupNE5LDurYtdsdkhJCypLo4C3IpKONUnAWp5KoUSg0gm0AXu5Qibs0hhsbGP4aCGV0wezO5YEYXzBgwPvL5tP9Yjb1_QyfZB-DrBOCg23vAaJIL2Dr0IQ7iGt-F_1_4C9Cgl_8</recordid><startdate>20040701</startdate><enddate>20040701</enddate><creator>Shah, Shital C.</creator><creator>Kusiak, Andrew</creator><general>Elsevier B.V</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>E3H</scope><scope>F2A</scope><scope>7X8</scope></search><sort><creationdate>20040701</creationdate><title>Data mining and genetic algorithm based gene/SNP selection</title><author>Shah, Shital C. 
; Kusiak, Andrew</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c420t-41c991da80a9b6dd292c5f07e178709b66cb0d74075d86bc4cb7565994005ce93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Algorithms</topic><topic>Artificial intelligence</topic><topic>Computer applications</topic><topic>Data mining</topic><topic>Decision Trees</topic><topic>DNA sequence analysis</topic><topic>Drug effectiveness</topic><topic>Drug Therapy</topic><topic>Feature selection</topic><topic>Genes</topic><topic>Genetic algorithm</topic><topic>Genetics</topic><topic>Genomics</topic><topic>Humans</topic><topic>Information Storage and Retrieval</topic><topic>Intersection approach</topic><topic>Medicine</topic><topic>Models, Genetic</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Prognosis</topic><topic>Single nucleotide polymorphisms</topic><topic>Single nucleotide polymorphisms (SNPs)</topic><topic>Treatment Outcome</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Shah, Shital C.</creatorcontrib><creatorcontrib>Kusiak, Andrew</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><collection>MEDLINE - Academic</collection><jtitle>Artificial intelligence in medicine</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Shah, Shital C.</au><au>Kusiak, Andrew</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Data mining and genetic algorithm based gene/SNP selection</atitle><jtitle>Artificial intelligence in medicine</jtitle><addtitle>Artif Intell Med</addtitle><date>2004-07-01</date><risdate>2004</risdate><volume>31</volume><issue>3</issue><spage>183</spage><epage>196</epage><pages>183-196</pages><issn>0933-3657</issn><eissn>1873-2860</eissn><abstract>Objective: Genomic studies provide large volumes of data with the number of single nucleotide polymorphisms (SNPs) ranging into thousands. The analysis of SNPs permits determining relationships between genotypic and phenotypic information as well as the identification of SNPs related to a disease. The growing wealth of information and advances in biology call for the development of approaches for discovery of new knowledge. One such area is the identification of gene/SNP patterns impacting cure/drug development for various diseases. Methods: A new approach for predicting drug effectiveness is presented. The approach is based on data mining and genetic algorithms. A global search mechanism, weighted decision tree, decision-tree-based wrapper, a correlation-based heuristic, and the identification of intersecting feature sets are employed for selecting significant genes. Results: The feature selection approach has resulted in 85% reduction of number of features. The relative increase in cross-validation accuracy and specificity for the significant gene/SNP set was 10% and 3.2%, respectively. Conclusion: The feature selection approach was successfully applied to data sets for drug and placebo subjects. The number of features has been significantly reduced while the quality of knowledge was enhanced. The feature set intersection approach provided the most significant genes/SNPs. 
The results reported in the paper discuss associations among SNPs resulting in patient-specific treatment protocols.</abstract><cop>Netherlands</cop><pub>Elsevier B.V</pub><pmid>15302085</pmid><doi>10.1016/j.artmed.2004.04.002</doi><tpages>14</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0933-3657
ispartof	Artificial intelligence in medicine, 2004-07, Vol.31 (3), p.183-196
issn	0933-3657 1873-2860
language	eng
recordid	cdi_proquest_miscellaneous_66773192
source	MEDLINE; ScienceDirect Journals (5 years ago - present)
subjects	Algorithms ; Artificial intelligence ; Computer applications ; Data mining ; Decision Trees ; DNA sequence analysis ; Drug effectiveness ; Drug Therapy ; Feature selection ; Genes ; Genetic algorithm ; Genetics ; Genomics ; Humans ; Information Storage and Retrieval ; Intersection approach ; Medicine ; Models, Genetic ; Polymorphism, Single Nucleotide ; Prognosis ; Single nucleotide polymorphisms ; Single nucleotide polymorphisms (SNPs) ; Treatment Outcome
title	Data mining and genetic algorithm based gene/SNP selection
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T19%3A45%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Data%20mining%20and%20genetic%20algorithm%20based%20gene/SNP%20selection&rft.jtitle=Artificial%20intelligence%20in%20medicine&rft.au=Shah,%20Shital%20C.&rft.date=2004-07-01&rft.volume=31&rft.issue=3&rft.spage=183&rft.epage=196&rft.pages=183-196&rft.issn=0933-3657&rft.eissn=1873-2860&rft_id=info:doi/10.1016/j.artmed.2004.04.002&rft_dat=%3Cproquest_cross%3E66773192%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=17583688&rft_id=info:pmid/15302085&rft_els_id=S0933365704000521&rfr_iscdi=true
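The url field above is an OpenURL (Z39.88-2004, key/encoded-value format) whose `rft.*` keys repeat the citation data of this record. As a rough sketch, the citation can be recovered from such a link with Python's standard `urllib.parse`. The helper name `citation_fields` is hypothetical, and the URL in the example is an abbreviated copy of the one above, with several parameters left out for brevity.

```python
from urllib.parse import urlsplit, parse_qs

# Abbreviated copy of the record's url field; several parameters are omitted.
openurl = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.atitle=Data%20mining%20and%20genetic%20algorithm%20based%20gene/SNP%20selection"
    "&rft.jtitle=Artificial%20intelligence%20in%20medicine"
    "&rft.au=Shah,%20Shital%20C."
    "&rft.date=2004-07-01&rft.volume=31&rft.issue=3&rft.pages=183-196"
    "&rft_id=info:doi/10.1016/j.artmed.2004.04.002"
    "&rft_id=info:pmid/15302085"
)


def citation_fields(url):
    """Collect the rft.* (referent) keys of an OpenURL into a plain dict."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs maps each key to a list because keys may repeat (rft_id does).
    fields = {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }
    fields["ids"] = query.get("rft_id", [])
    return fields


fields = citation_fields(openurl)
print(fields["atitle"])  # Data mining and genetic algorithm based gene/SNP selection
print(fields["ids"])     # ['info:doi/10.1016/j.artmed.2004.04.002', 'info:pmid/15302085']
```

Taking only the first value of each `rft.*` key is a simplification that suits this record, where each of those keys occurs once; a link with several `rft.au` entries would need the full list.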