Non-negative matrix factorization by maximizing correntropy for cancer clustering

Non-negative matrix factorization (NMF) has been shown to be a powerful tool for clustering gene expression data, which are widely used to classify cancers. NMF aims to find two non-negative matrices whose product closely approximates the original matrix. Traditional NMF methods minimize either the...
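
The idea summarized in this abstract (given in full in the description field of the record) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gaussian kernel, one correntropy weight per feature (row of the data matrix), a bandwidth set to the mean squared row residual, and weighted multiplicative updates in place of the expectation conditional maximization steps named in the abstract. The function name `nmf_mcc_sketch` and all of its parameters are hypothetical.

```python
import numpy as np


def nmf_mcc_sketch(X, k, n_iter=200, eps=1e-10, seed=0):
    """Illustrative correntropy-weighted NMF: X (d x n) ~ H (d x k) @ W (k x n).

    Assumptions (not taken from the paper): Gaussian kernel, one weight per
    row of X, bandwidth equal to the mean squared row residual.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    H = rng.random((d, k)) + eps
    W = rng.random((k, n)) + eps
    rho = np.ones(d)
    for _ in range(n_iter):
        # Squared reconstruction error of each row (feature).
        err = np.sum((X - H @ W) ** 2, axis=1)
        # Kernel bandwidth heuristic (an assumption of this sketch).
        sigma2 = max(err.mean(), eps)
        # Gaussian-kernel weights: rows with large residuals get small weights.
        rho = np.exp(-err / (2.0 * sigma2))
        # Row weights cancel in the update of H, so it matches plain NMF.
        H *= (X @ W.T) / (H @ (W @ W.T) + eps)
        # Weighted multiplicative update of W; H.T * rho equals H^T diag(rho).
        HtR = H.T * rho
        W *= (HtR @ X) / (HtR @ H @ W + eps)
    return H, W, rho
```

Because the weights shrink toward zero for rows that are reconstructed poorly, such rows contribute little to the update of `W`, which is the robustness to outliers that the abstract attributes to correntropy. Fixing all weights at 1 reduces the sketch to ordinary NMF under the l2 loss.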

Bibliographic details
Published in: BMC bioinformatics 2013-03, Vol.14 (1), p.107-107, Article 107
Main authors: Wang, Jim Jing-Yan; Wang, Xiaolei; Gao, Xin
Format: Article
Language: eng
Online access: Full text
container_end_page 107
container_issue 1
container_start_page 107
container_title BMC bioinformatics
container_volume 14
creator Wang, Jim Jing-Yan
Wang, Xiaolei
Gao, Xin
description Non-negative matrix factorization (NMF) has been shown to be a powerful tool for clustering gene expression data, which are widely used to classify cancers. NMF aims to find two non-negative matrices whose product closely approximates the original matrix. Traditional NMF methods minimize either the l2 norm or the Kullback-Leibler distance between the product of the two matrices and the original matrix. Correntropy was recently shown to be an effective similarity measurement due to its stability to outliers or noise. We propose a maximum correntropy criterion (MCC)-based NMF method (NMF-MCC) for gene expression data-based cancer clustering. Instead of minimizing the l2 norm or the Kullback-Leibler distance, NMF-MCC maximizes the correntropy between the product of the two matrices and the original matrix. The optimization problem can be solved by an expectation conditional maximization algorithm. Extensive experiments on six cancer benchmark sets demonstrate that the proposed method is significantly more accurate than the state-of-the-art methods in cancer clustering.
doi_str_mv 10.1186/1471-2105-14-107
format Article
fullrecord <record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3659102</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A534518220</galeid><sourcerecordid>A534518220</sourcerecordid><originalsourceid>FETCH-LOGICAL-b684t-f224108ca3ccec406c0443c8bba8e3179b30f80a49eb6d425c1861a9bed821ae3</originalsourceid><addsrcrecordid>eNqNkk1v1DAQhi1ERUvLnROKxKUcUvyZdS5I1YqPShWIAmfL8U6Cq8RebKfa7a_H0ZZtg4qEfJjRzDOvxq-N0EuCzwiR1VvCF6SkBIuS8JLgxRN0tC89fZAfoucxXmNMFhKLZ-iQMkEp4_wIff3sXemg08neQDHoFOymaLVJPtjbXPSuaLa5vrGDvbWuK4wPAVwKfr0tWh8Ko52BHPoxJgiZOEEHre4jvLiLx-jHh_ffl5_Kyy8fL5bnl2VTSZ7KllJOsDSaGQOG48pgzpmRTaMlMLKoG4ZbiTWvoalWnAqTL0x03cBKUqKBHaN3O9312AywMtNSulfrYAcdtsprq-YdZ3-qzt8oVomaYJoFljuBxvp_CMw7xg9qslRNluZMZcezyundGsH_GiEmNdhooO-1Az9GRZioaiZqzP8HZbVkXIiMvv4LvfZjcNnPiRKcMMbZPdXpHpR1rc97mklUnYusQySlOFNnj1D5rGCwxjtoba7PBt7MBjKTYJM6PcaoLr5dzVm8Y03wMQZo9_4RrKY_-phjrx4-3H7gz6dkvwFNVeBf</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1355413343</pqid></control><display><type>article</type><title>Non-negative matrix factorization by maximizing correntropy for cancer clustering</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>SpringerNature Journals</source><source>PubMed Central Open Access</source><source>PubMed Central</source><source>Springer Nature OA/Free Journals</source><creator>Wang, Jim Jing-Yan ; Wang, Xiaolei ; Gao, Xin</creator><creatorcontrib>Wang, Jim Jing-Yan ; Wang, Xiaolei ; Gao, Xin</creatorcontrib><description>Non-negative matrix factorization (NMF) has been shown to be a powerful tool for clustering gene expression data, which are widely used to classify cancers. NMF aims to find two non-negative matrices whose product closely approximates the original matrix. 
Traditional NMF methods minimize either the l2 norm or the Kullback-Leibler distance between the product of the two matrices and the original matrix. Correntropy was recently shown to be an effective similarity measurement due to its stability to outliers or noise. We propose a maximum correntropy criterion (MCC)-based NMF method (NMF-MCC) for gene expression data-based cancer clustering. Instead of minimizing the l2 norm or the Kullback-Leibler distance, NMF-MCC maximizes the correntropy between the product of the two matrices and the original matrix. The optimization problem can be solved by an expectation conditional maximization algorithm. Extensive experiments on six cancer benchmark sets demonstrate that the proposed method is significantly more accurate than the state-of-the-art methods in cancer clustering.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/1471-2105-14-107</identifier><identifier>PMID: 23522344</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Algorithms ; Cancer ; Classification ; Cluster Analysis ; Entropy (Information theory) ; Gene expression ; Gene Expression Profiling - methods ; Genetic aspects ; Genomes ; Health sciences ; Humans ; Information theory ; Matrices ; Methodology ; Methods ; Neoplasms - classification ; Neoplasms - genetics ; Neoplasms - metabolism ; Noise ; Oncology, Experimental ; Science ; Studies</subject><ispartof>BMC bioinformatics, 2013-03, Vol.14 (1), p.107-107, Article 107</ispartof><rights>COPYRIGHT 2013 BioMed Central Ltd.</rights><rights>2013 Wang et al.; licensee BioMed Central Ltd. 
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</rights><rights>Copyright © 2013 Wang et al.; licensee BioMed Central Ltd. 2013 Wang et al.; licensee BioMed Central Ltd.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-b684t-f224108ca3ccec406c0443c8bba8e3179b30f80a49eb6d425c1861a9bed821ae3</citedby><cites>FETCH-LOGICAL-b684t-f224108ca3ccec406c0443c8bba8e3179b30f80a49eb6d425c1861a9bed821ae3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3659102/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3659102/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,315,728,781,785,865,886,27929,27930,53796,53798</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23522344$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Wang, Jim Jing-Yan</creatorcontrib><creatorcontrib>Wang, Xiaolei</creatorcontrib><creatorcontrib>Gao, Xin</creatorcontrib><title>Non-negative matrix factorization by maximizing correntropy for cancer clustering</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>Non-negative matrix factorization (NMF) has been shown to be a powerful tool for clustering gene expression data, which are widely used to classify cancers. NMF aims to find two non-negative matrices whose product closely approximates the original matrix. 
Traditional NMF methods minimize either the l2 norm or the Kullback-Leibler distance between the product of the two matrices and the original matrix. Correntropy was recently shown to be an effective similarity measurement due to its stability to outliers or noise. We propose a maximum correntropy criterion (MCC)-based NMF method (NMF-MCC) for gene expression data-based cancer clustering. Instead of minimizing the l2 norm or the Kullback-Leibler distance, NMF-MCC maximizes the correntropy between the product of the two matrices and the original matrix. The optimization problem can be solved by an expectation conditional maximization algorithm. Extensive experiments on six cancer benchmark sets demonstrate that the proposed method is significantly more accurate than the state-of-the-art methods in cancer clustering.</description><subject>Algorithms</subject><subject>Cancer</subject><subject>Classification</subject><subject>Cluster Analysis</subject><subject>Entropy (Information theory)</subject><subject>Gene expression</subject><subject>Gene Expression Profiling - methods</subject><subject>Genetic aspects</subject><subject>Genomes</subject><subject>Health sciences</subject><subject>Humans</subject><subject>Information theory</subject><subject>Matrices</subject><subject>Methodology</subject><subject>Methods</subject><subject>Neoplasms - classification</subject><subject>Neoplasms - genetics</subject><subject>Neoplasms - metabolism</subject><subject>Noise</subject><subject>Oncology, 
Experimental</subject><subject>Science</subject><subject>Studies</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNqNkk1v1DAQhi1ERUvLnROKxKUcUvyZdS5I1YqPShWIAmfL8U6Cq8RebKfa7a_H0ZZtg4qEfJjRzDOvxq-N0EuCzwiR1VvCF6SkBIuS8JLgxRN0tC89fZAfoucxXmNMFhKLZ-iQMkEp4_wIff3sXemg08neQDHoFOymaLVJPtjbXPSuaLa5vrGDvbWuK4wPAVwKfr0tWh8Ko52BHPoxJgiZOEEHre4jvLiLx-jHh_ffl5_Kyy8fL5bnl2VTSZ7KllJOsDSaGQOG48pgzpmRTaMlMLKoG4ZbiTWvoalWnAqTL0x03cBKUqKBHaN3O9312AywMtNSulfrYAcdtsprq-YdZ3-qzt8oVomaYJoFljuBxvp_CMw7xg9qslRNluZMZcezyundGsH_GiEmNdhooO-1Az9GRZioaiZqzP8HZbVkXIiMvv4LvfZjcNnPiRKcMMbZPdXpHpR1rc97mklUnYusQySlOFNnj1D5rGCwxjtoba7PBt7MBjKTYJM6PcaoLr5dzVm8Y03wMQZo9_4RrKY_-phjrx4-3H7gz6dkvwFNVeBf</recordid><startdate>20130324</startdate><enddate>20130324</enddate><creator>Wang, Jim Jing-Yan</creator><creator>Wang, Xiaolei</creator><creator>Gao, Xin</creator><general>BioMed Central Ltd</general><general>BioMed 
Central</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20130324</creationdate><title>Non-negative matrix factorization by maximizing correntropy for cancer clustering</title><author>Wang, Jim Jing-Yan ; Wang, Xiaolei ; Gao, Xin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-b684t-f224108ca3ccec406c0443c8bba8e3179b30f80a49eb6d425c1861a9bed821ae3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Algorithms</topic><topic>Cancer</topic><topic>Classification</topic><topic>Cluster Analysis</topic><topic>Entropy (Information theory)</topic><topic>Gene expression</topic><topic>Gene Expression Profiling - methods</topic><topic>Genetic aspects</topic><topic>Genomes</topic><topic>Health sciences</topic><topic>Humans</topic><topic>Information 
theory</topic><topic>Matrices</topic><topic>Methodology</topic><topic>Methods</topic><topic>Neoplasms - classification</topic><topic>Neoplasms - genetics</topic><topic>Neoplasms - metabolism</topic><topic>Noise</topic><topic>Oncology, Experimental</topic><topic>Science</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Jim Jing-Yan</creatorcontrib><creatorcontrib>Wang, Xiaolei</creatorcontrib><creatorcontrib>Gao, Xin</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest 
Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>ProQuest Biological Science Collection</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Access via ProQuest (Open Access)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wang, Jim Jing-Yan</au><au>Wang, Xiaolei</au><au>Gao, Xin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Non-negative matrix factorization by maximizing correntropy for cancer clustering</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC Bioinformatics</addtitle><date>2013-03-24</date><risdate>2013</risdate><volume>14</volume><issue>1</issue><spage>107</spage><epage>107</epage><pages>107-107</pages><artnum>107</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Non-negative matrix factorization (NMF) has been shown to be a powerful tool for clustering gene expression data, which are widely used to classify cancers. NMF aims to find two non-negative matrices whose product closely approximates the original matrix. Traditional NMF methods minimize either the l2 norm or the Kullback-Leibler distance between the product of the two matrices and the original matrix. Correntropy was recently shown to be an effective similarity measurement due to its stability to outliers or noise. We propose a maximum correntropy criterion (MCC)-based NMF method (NMF-MCC) for gene expression data-based cancer clustering. Instead of minimizing the l2 norm or the Kullback-Leibler distance, NMF-MCC maximizes the correntropy between the product of the two matrices and the original matrix. The optimization problem can be solved by an expectation conditional maximization algorithm. Extensive experiments on six cancer benchmark sets demonstrate that the proposed method is significantly more accurate than the state-of-the-art methods in cancer clustering.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>23522344</pmid><doi>10.1186/1471-2105-14-107</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1471-2105
ispartof BMC bioinformatics, 2013-03, Vol.14 (1), p.107-107, Article 107
issn 1471-2105
1471-2105
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3659102
source MEDLINE; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; SpringerNature Journals; PubMed Central Open Access; PubMed Central; Springer Nature OA/Free Journals
subjects Algorithms
Cancer
Classification
Cluster Analysis
Entropy (Information theory)
Gene expression
Gene Expression Profiling - methods
Genetic aspects
Genomes
Health sciences
Humans
Information theory
Matrices
Methodology
Methods
Neoplasms - classification
Neoplasms - genetics
Neoplasms - metabolism
Noise
Oncology, Experimental
Science
Studies
title Non-negative matrix factorization by maximizing correntropy for cancer clustering
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-13T00%3A32%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Non-negative%20matrix%20factorization%20by%20maximizing%20correntropy%20for%20cancer%20clustering&rft.jtitle=BMC%20bioinformatics&rft.au=Wang,%20Jim%20Jing-Yan&rft.date=2013-03-24&rft.volume=14&rft.issue=1&rft.spage=107&rft.epage=107&rft.pages=107-107&rft.artnum=107&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/1471-2105-14-107&rft_dat=%3Cgale_pubme%3EA534518220%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1355413343&rft_id=info:pmid/23522344&rft_galeid=A534518220&rfr_iscdi=true