A Survey on Filter Techniques for Feature Selection in Gene Expression Microarray Analysis

A plenitude of feature selection (FS) methods is available in the literature, most of them rising as a need to analyze data of very high dimension, usually hundreds or thousands of variables. Such data sets are now available in various application areas like combinatorial chemistry, text mining, mul...

Bibliographic details
Published in: IEEE/ACM transactions on computational biology and bioinformatics 2012-07, Vol.9 (4), p.1106-1119
Main authors: Lazar, C., Taminau, J., Meganck, S., Steenhoff, D., Coletta, A., Molter, C., de Schaetzen, V., Duque, R., Bersini, H., Nowe, A.
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 1119
container_issue 4
container_start_page 1106
container_title IEEE/ACM transactions on computational biology and bioinformatics
container_volume 9
creator Lazar, C.
Taminau, J.
Meganck, S.
Steenhoff, D.
Coletta, A.
Molter, C.
de Schaetzen, V.
Duque, R.
Bersini, H.
Nowe, A.
description A plenitude of feature selection (FS) methods is available in the literature, most of them rising as a need to analyze data of very high dimension, usually hundreds or thousands of variables. Such data sets are now available in various application areas like combinatorial chemistry, text mining, multivariate imaging, or bioinformatics. As a general accepted rule, these methods are grouped in filters, wrappers, and embedded methods. More recently, a new group of methods has been added in the general framework of FS: ensemble techniques. The focus in this survey is on filter feature selection methods for informative feature discovery in gene expression microarray (GEM) analysis, which is also known as differentially expressed genes (DEGs) discovery, gene prioritization, or biomarker discovery. We present them in a unified framework, using standardized notations in order to reveal their technical details and to highlight their common characteristics as well as their particularities.
doi_str_mv 10.1109/TCBB.2012.33
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_miscellaneous_1020849955</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6152088</ieee_id><sourcerecordid>1020849955</sourcerecordid><originalsourceid>FETCH-LOGICAL-c415t-72d047a0f3f6e3c7fdce42d5c34ed174ead40d451841aadeab5eaea52d3d7b0c3</originalsourceid><addsrcrecordid>eNqN0c9PwjAUB_DGaATRmzcT08SLB4f9ubEjEEATjAfw4mUp7VssGRu2m5H_3i4gB0-e2rSfvLz3vghdU9KnlKSPy_Fo1GeEsj7nJ6hLpUyiNI3FaXsXMpJpzDvowvs1IUykRJyjDmNcEkZJF70P8aJxX7DDVYmntqjB4SXoj9J-NuBxXjk8BVU3DvACCtC1Dc6WeAYl4Mn31oH37dOL1a5SzqkdHpaq2HnrL9FZrgoPV4ezh96mk-X4KZq_zp7Hw3mkBZV1lDBDRKJIzvMYuE5yo0EwIzUXYGgiQBlBjJB0IKhSBtRKggIlmeEmWRHNe-h-X3frqrbpOttYr6EoVAlV4zNK-ICxmCbsH5SRgUhTKQO9-0PXVePCaK1qF0tD1aAe9ipM772DPNs6u1FuF1DWxpO18WRtPBnngd8eijarDZgj_s0jgJs9sABw_I6pDH0N-A_arJLH</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1015451382</pqid></control><display><type>article</type><title>A Survey on Filter Techniques for Feature Selection in Gene Expression Microarray Analysis</title><source>IEEE Electronic Library (IEL)</source><creator>Lazar, C. ; Taminau, J. ; Meganck, S. ; Steenhoff, D. ; Coletta, A. ; Molter, C. ; de Schaetzen, V. ; Duque, R. ; Bersini, H. ; Nowe, A.</creator><creatorcontrib>Lazar, C. ; Taminau, J. ; Meganck, S. ; Steenhoff, D. ; Coletta, A. ; Molter, C. ; de Schaetzen, V. ; Duque, R. ; Bersini, H. ; Nowe, A.</creatorcontrib><description>A plenitude of feature selection (FS) methods is available in the literature, most of them rising as a need to analyze data of very high dimension, usually hundreds or thousands of variables. Such data sets are now available in various application areas like combinatorial chemistry, text mining, multivariate imaging, or bioinformatics. As a general accepted rule, these methods are grouped in filters, wrappers, and embedded methods. 
More recently, a new group of methods has been added in the general framework of FS: ensemble techniques. The focus in this survey is on filter feature selection methods for informative feature discovery in gene expression microarray (GEM) analysis, which is also known as differentially expressed genes (DEGs) discovery, gene prioritization, or biomarker discovery. We present them in a unified framework, using standardized notations in order to reveal their technical details and to highlight their common characteristics as well as their particularities.</description><identifier>ISSN: 1545-5963</identifier><identifier>EISSN: 1557-9964</identifier><identifier>DOI: 10.1109/TCBB.2012.33</identifier><identifier>PMID: 22350210</identifier><identifier>CODEN: ITCBCY</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Analysis of Variance ; Bayes Theorem ; Bioinformatics ; biomarker discovery ; Computational biology ; Computational Biology - methods ; Feature selection ; Gene expression ; gene expression data ; Gene Expression Profiling ; gene prioritization ; gene ranking ; Genetic Markers ; information filters ; Information Theory ; Measurement ; Models, Statistical ; Oligonucleotide Array Sequence Analysis ; ROC Curve ; scoring functions ; Search methods ; Software ; statistical methods ; Statistics, Nonparametric ; Studies ; Taxonomy</subject><ispartof>IEEE/ACM transactions on computational biology and bioinformatics, 2012-07, Vol.9 (4), p.1106-1119</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) Jul/Aug 2012</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c415t-72d047a0f3f6e3c7fdce42d5c34ed174ead40d451841aadeab5eaea52d3d7b0c3</citedby><cites>FETCH-LOGICAL-c415t-72d047a0f3f6e3c7fdce42d5c34ed174ead40d451841aadeab5eaea52d3d7b0c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6152088$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6152088$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/22350210$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Lazar, C.</creatorcontrib><creatorcontrib>Taminau, J.</creatorcontrib><creatorcontrib>Meganck, S.</creatorcontrib><creatorcontrib>Steenhoff, D.</creatorcontrib><creatorcontrib>Coletta, A.</creatorcontrib><creatorcontrib>Molter, C.</creatorcontrib><creatorcontrib>de Schaetzen, V.</creatorcontrib><creatorcontrib>Duque, R.</creatorcontrib><creatorcontrib>Bersini, H.</creatorcontrib><creatorcontrib>Nowe, A.</creatorcontrib><title>A Survey on Filter Techniques for Feature Selection in Gene Expression Microarray Analysis</title><title>IEEE/ACM transactions on computational biology and bioinformatics</title><addtitle>TCBB</addtitle><addtitle>IEEE/ACM Trans Comput Biol Bioinform</addtitle><description>A plenitude of feature selection (FS) methods is available in the literature, most of them rising as a need to analyze data of very high dimension, usually hundreds or thousands of variables. Such data sets are now available in various application areas like combinatorial chemistry, text mining, multivariate imaging, or bioinformatics. 
As a general accepted rule, these methods are grouped in filters, wrappers, and embedded methods. More recently, a new group of methods has been added in the general framework of FS: ensemble techniques. The focus in this survey is on filter feature selection methods for informative feature discovery in gene expression microarray (GEM) analysis, which is also known as differentially expressed genes (DEGs) discovery, gene prioritization, or biomarker discovery. We present them in a unified framework, using standardized notations in order to reveal their technical details and to highlight their common characteristics as well as their particularities.</description><subject>Analysis of Variance</subject><subject>Bayes Theorem</subject><subject>Bioinformatics</subject><subject>biomarker discovery</subject><subject>Computational biology</subject><subject>Computational Biology - methods</subject><subject>Feature selection</subject><subject>Gene expression</subject><subject>gene expression data</subject><subject>Gene Expression Profiling</subject><subject>gene prioritization</subject><subject>gene ranking</subject><subject>Genetic Markers</subject><subject>information filters</subject><subject>Information Theory</subject><subject>Measurement</subject><subject>Models, Statistical</subject><subject>Oligonucleotide Array Sequence Analysis</subject><subject>ROC Curve</subject><subject>scoring functions</subject><subject>Search methods</subject><subject>Software</subject><subject>statistical methods</subject><subject>Statistics, 
Nonparametric</subject><subject>Studies</subject><subject>Taxonomy</subject><issn>1545-5963</issn><issn>1557-9964</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><sourceid>EIF</sourceid><recordid>eNqN0c9PwjAUB_DGaATRmzcT08SLB4f9ubEjEEATjAfw4mUp7VssGRu2m5H_3i4gB0-e2rSfvLz3vghdU9KnlKSPy_Fo1GeEsj7nJ6hLpUyiNI3FaXsXMpJpzDvowvs1IUykRJyjDmNcEkZJF70P8aJxX7DDVYmntqjB4SXoj9J-NuBxXjk8BVU3DvACCtC1Dc6WeAYl4Mn31oH37dOL1a5SzqkdHpaq2HnrL9FZrgoPV4ezh96mk-X4KZq_zp7Hw3mkBZV1lDBDRKJIzvMYuE5yo0EwIzUXYGgiQBlBjJB0IKhSBtRKggIlmeEmWRHNe-h-X3frqrbpOttYr6EoVAlV4zNK-ICxmCbsH5SRgUhTKQO9-0PXVePCaK1qF0tD1aAe9ipM772DPNs6u1FuF1DWxpO18WRtPBnngd8eijarDZgj_s0jgJs9sABw_I6pDH0N-A_arJLH</recordid><startdate>20120701</startdate><enddate>20120701</enddate><creator>Lazar, C.</creator><creator>Taminau, J.</creator><creator>Meganck, S.</creator><creator>Steenhoff, D.</creator><creator>Coletta, A.</creator><creator>Molter, C.</creator><creator>de Schaetzen, V.</creator><creator>Duque, R.</creator><creator>Bersini, H.</creator><creator>Nowe, A.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>JG9</scope><scope>JQ2</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope></search><sort><creationdate>20120701</creationdate><title>A Survey on Filter Techniques for Feature Selection in Gene Expression Microarray Analysis</title><author>Lazar, C. ; Taminau, J. ; Meganck, S. ; Steenhoff, D. ; Coletta, A. ; Molter, C. ; de Schaetzen, V. ; Duque, R. ; Bersini, H. ; Nowe, A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c415t-72d047a0f3f6e3c7fdce42d5c34ed174ead40d451841aadeab5eaea52d3d7b0c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Analysis of Variance</topic><topic>Bayes Theorem</topic><topic>Bioinformatics</topic><topic>biomarker discovery</topic><topic>Computational biology</topic><topic>Computational Biology - methods</topic><topic>Feature selection</topic><topic>Gene expression</topic><topic>gene expression data</topic><topic>Gene Expression Profiling</topic><topic>gene prioritization</topic><topic>gene ranking</topic><topic>Genetic Markers</topic><topic>information filters</topic><topic>Information Theory</topic><topic>Measurement</topic><topic>Models, Statistical</topic><topic>Oligonucleotide Array Sequence Analysis</topic><topic>ROC Curve</topic><topic>scoring functions</topic><topic>Search methods</topic><topic>Software</topic><topic>statistical methods</topic><topic>Statistics, 
Nonparametric</topic><topic>Studies</topic><topic>Taxonomy</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lazar, C.</creatorcontrib><creatorcontrib>Taminau, J.</creatorcontrib><creatorcontrib>Meganck, S.</creatorcontrib><creatorcontrib>Steenhoff, D.</creatorcontrib><creatorcontrib>Coletta, A.</creatorcontrib><creatorcontrib>Molter, C.</creatorcontrib><creatorcontrib>de Schaetzen, V.</creatorcontrib><creatorcontrib>Duque, R.</creatorcontrib><creatorcontrib>Bersini, H.</creatorcontrib><creatorcontrib>Nowe, A.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Civil 
Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE/ACM transactions on computational biology and bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Lazar, C.</au><au>Taminau, J.</au><au>Meganck, S.</au><au>Steenhoff, D.</au><au>Coletta, A.</au><au>Molter, C.</au><au>de Schaetzen, V.</au><au>Duque, R.</au><au>Bersini, H.</au><au>Nowe, A.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Survey on Filter Techniques for Feature Selection in Gene Expression Microarray Analysis</atitle><jtitle>IEEE/ACM transactions on computational biology and bioinformatics</jtitle><stitle>TCBB</stitle><addtitle>IEEE/ACM Trans Comput Biol Bioinform</addtitle><date>2012-07-01</date><risdate>2012</risdate><volume>9</volume><issue>4</issue><spage>1106</spage><epage>1119</epage><pages>1106-1119</pages><issn>1545-5963</issn><eissn>1557-9964</eissn><coden>ITCBCY</coden><abstract>A plenitude of feature selection (FS) methods is available in the literature, most of them rising as a need to analyze data of very high dimension, usually hundreds or thousands of variables. Such data sets are now available in various application areas like combinatorial chemistry, text mining, multivariate imaging, or bioinformatics. As a general accepted rule, these methods are grouped in filters, wrappers, and embedded methods. More recently, a new group of methods has been added in the general framework of FS: ensemble techniques. 
The focus in this survey is on filter feature selection methods for informative feature discovery in gene expression microarray (GEM) analysis, which is also known as differentially expressed genes (DEGs) discovery, gene prioritization, or biomarker discovery. We present them in a unified framework, using standardized notations in order to reveal their technical details and to highlight their common characteristics as well as their particularities.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>22350210</pmid><doi>10.1109/TCBB.2012.33</doi><tpages>14</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1545-5963
ispartof IEEE/ACM transactions on computational biology and bioinformatics, 2012-07, Vol.9 (4), p.1106-1119
issn 1545-5963
1557-9964
language eng
recordid cdi_proquest_miscellaneous_1020849955
source IEEE Electronic Library (IEL)
subjects Analysis of Variance
Bayes Theorem
Bioinformatics
biomarker discovery
Computational biology
Computational Biology - methods
Feature selection
Gene expression
gene expression data
Gene Expression Profiling
gene prioritization
gene ranking
Genetic Markers
information filters
Information Theory
Measurement
Models, Statistical
Oligonucleotide Array Sequence Analysis
ROC Curve
scoring functions
Search methods
Software
statistical methods
Statistics, Nonparametric
Studies
Taxonomy
title A Survey on Filter Techniques for Feature Selection in Gene Expression Microarray Analysis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T06%3A27%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Survey%20on%20Filter%20Techniques%20for%20Feature%20Selection%20in%20Gene%20Expression%20Microarray%20Analysis&rft.jtitle=IEEE/ACM%20transactions%20on%20computational%20biology%20and%20bioinformatics&rft.au=Lazar,%20C.&rft.date=2012-07-01&rft.volume=9&rft.issue=4&rft.spage=1106&rft.epage=1119&rft.pages=1106-1119&rft.issn=1545-5963&rft.eissn=1557-9964&rft.coden=ITCBCY&rft_id=info:doi/10.1109/TCBB.2012.33&rft_dat=%3Cproquest_RIE%3E1020849955%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1015451382&rft_id=info:pmid/22350210&rft_ieee_id=6152088&rfr_iscdi=true