Non-negative matrix factorization by maximizing correntropy for cancer clustering
Non-negative matrix factorization (NMF) has been shown to be a powerful tool for clustering gene expression data, which are widely used to classify cancers. NMF aims to find two non-negative matrices whose product closely approximates the original matrix. Traditional NMF methods minimize either the...
Published in: | BMC bioinformatics 2013-03, Vol.14 (1), p.107-107, Article 107 |
---|---|
Main authors: | Wang, Jim Jing-Yan; Wang, Xiaolei; Gao, Xin |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 107 |
---|---|
container_issue | 1 |
container_start_page | 107 |
container_title | BMC bioinformatics |
container_volume | 14 |
creator | Wang, Jim Jing-Yan; Wang, Xiaolei; Gao, Xin |
description | Non-negative matrix factorization (NMF) has been shown to be a powerful tool for clustering gene expression data, which are widely used to classify cancers. NMF aims to find two non-negative matrices whose product closely approximates the original matrix. Traditional NMF methods minimize either the l2 norm or the Kullback-Leibler distance between the product of the two matrices and the original matrix. Correntropy was recently shown to be an effective similarity measurement due to its stability to outliers or noise.
We propose a maximum correntropy criterion (MCC)-based NMF method (NMF-MCC) for gene expression data-based cancer clustering. Instead of minimizing the l2 norm or the Kullback-Leibler distance, NMF-MCC maximizes the correntropy between the product of the two matrices and the original matrix. The optimization problem can be solved by an expectation conditional maximization algorithm.
Extensive experiments on six cancer benchmark sets demonstrate that the proposed method is significantly more accurate than the state-of-the-art methods in cancer clustering. |
doi_str_mv | 10.1186/1471-2105-14-107 |
format | Article |
fullrecord | <record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3659102</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A534518220</galeid><sourcerecordid>A534518220</sourcerecordid><originalsourceid>FETCH-LOGICAL-b684t-f224108ca3ccec406c0443c8bba8e3179b30f80a49eb6d425c1861a9bed821ae3</originalsourceid><addsrcrecordid>eNqNkk1v1DAQhi1ERUvLnROKxKUcUvyZdS5I1YqPShWIAmfL8U6Cq8RebKfa7a_H0ZZtg4qEfJjRzDOvxq-N0EuCzwiR1VvCF6SkBIuS8JLgxRN0tC89fZAfoucxXmNMFhKLZ-iQMkEp4_wIff3sXemg08neQDHoFOymaLVJPtjbXPSuaLa5vrGDvbWuK4wPAVwKfr0tWh8Ko52BHPoxJgiZOEEHre4jvLiLx-jHh_ffl5_Kyy8fL5bnl2VTSZ7KllJOsDSaGQOG48pgzpmRTaMlMLKoG4ZbiTWvoalWnAqTL0x03cBKUqKBHaN3O9312AywMtNSulfrYAcdtsprq-YdZ3-qzt8oVomaYJoFljuBxvp_CMw7xg9qslRNluZMZcezyundGsH_GiEmNdhooO-1Az9GRZioaiZqzP8HZbVkXIiMvv4LvfZjcNnPiRKcMMbZPdXpHpR1rc97mklUnYusQySlOFNnj1D5rGCwxjtoba7PBt7MBjKTYJM6PcaoLr5dzVm8Y03wMQZo9_4RrKY_-phjrx4-3H7gz6dkvwFNVeBf</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1355413343</pqid></control><display><type>article</type><title>Non-negative matrix factorization by maximizing correntropy for cancer clustering</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>SpringerNature Journals</source><source>PubMed Central Open Access</source><source>PubMed Central</source><source>Springer Nature OA/Free Journals</source><creator>Wang, Jim Jing-Yan ; Wang, Xiaolei ; Gao, Xin</creator><creatorcontrib>Wang, Jim Jing-Yan ; Wang, Xiaolei ; Gao, Xin</creatorcontrib><description>Non-negative matrix factorization (NMF) has been shown to be a powerful tool for clustering gene expression data, which are widely used to classify cancers. NMF aims to find two non-negative matrices whose product closely approximates the original matrix. 
Traditional NMF methods minimize either the l2 norm or the Kullback-Leibler distance between the product of the two matrices and the original matrix. Correntropy was recently shown to be an effective similarity measurement due to its stability to outliers or noise.
We propose a maximum correntropy criterion (MCC)-based NMF method (NMF-MCC) for gene expression data-based cancer clustering. Instead of minimizing the l2 norm or the Kullback-Leibler distance, NMF-MCC maximizes the correntropy between the product of the two matrices and the original matrix. The optimization problem can be solved by an expectation conditional maximization algorithm.
Extensive experiments on six cancer benchmark sets demonstrate that the proposed method is significantly more accurate than the state-of-the-art methods in cancer clustering.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/1471-2105-14-107</identifier><identifier>PMID: 23522344</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Algorithms ; Cancer ; Classification ; Cluster Analysis ; Entropy (Information theory) ; Gene expression ; Gene Expression Profiling - methods ; Genetic aspects ; Genomes ; Health sciences ; Humans ; Information theory ; Matrices ; Methodology ; Methods ; Neoplasms - classification ; Neoplasms - genetics ; Neoplasms - metabolism ; Noise ; Oncology, Experimental ; Science ; Studies</subject><ispartof>BMC bioinformatics, 2013-03, Vol.14 (1), p.107-107, Article 107</ispartof><rights>COPYRIGHT 2013 BioMed Central Ltd.</rights><rights>2013 Wang et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</rights><rights>Copyright © 2013 Wang et al.; licensee BioMed Central Ltd. 
2013 Wang et al.; licensee BioMed Central Ltd.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-b684t-f224108ca3ccec406c0443c8bba8e3179b30f80a49eb6d425c1861a9bed821ae3</citedby><cites>FETCH-LOGICAL-b684t-f224108ca3ccec406c0443c8bba8e3179b30f80a49eb6d425c1861a9bed821ae3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3659102/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3659102/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,315,728,781,785,865,886,27929,27930,53796,53798</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23522344$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Wang, Jim Jing-Yan</creatorcontrib><creatorcontrib>Wang, Xiaolei</creatorcontrib><creatorcontrib>Gao, Xin</creatorcontrib><title>Non-negative matrix factorization by maximizing correntropy for cancer clustering</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>Non-negative matrix factorization (NMF) has been shown to be a powerful tool for clustering gene expression data, which are widely used to classify cancers. NMF aims to find two non-negative matrices whose product closely approximates the original matrix. Traditional NMF methods minimize either the l2 norm or the Kullback-Leibler distance between the product of the two matrices and the original matrix. Correntropy was recently shown to be an effective similarity measurement due to its stability to outliers or noise.
We propose a maximum correntropy criterion (MCC)-based NMF method (NMF-MCC) for gene expression data-based cancer clustering. Instead of minimizing the l2 norm or the Kullback-Leibler distance, NMF-MCC maximizes the correntropy between the product of the two matrices and the original matrix. The optimization problem can be solved by an expectation conditional maximization algorithm.
Extensive experiments on six cancer benchmark sets demonstrate that the proposed method is significantly more accurate than the state-of-the-art methods in cancer clustering.</description><subject>Algorithms</subject><subject>Cancer</subject><subject>Classification</subject><subject>Cluster Analysis</subject><subject>Entropy (Information theory)</subject><subject>Gene expression</subject><subject>Gene Expression Profiling - methods</subject><subject>Genetic aspects</subject><subject>Genomes</subject><subject>Health sciences</subject><subject>Humans</subject><subject>Information theory</subject><subject>Matrices</subject><subject>Methodology</subject><subject>Methods</subject><subject>Neoplasms - classification</subject><subject>Neoplasms - genetics</subject><subject>Neoplasms - metabolism</subject><subject>Noise</subject><subject>Oncology, Experimental</subject><subject>Science</subject><subject>Studies</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNqNkk1v1DAQhi1ERUvLnROKxKUcUvyZdS5I1YqPShWIAmfL8U6Cq8RebKfa7a_H0ZZtg4qEfJjRzDOvxq-N0EuCzwiR1VvCF6SkBIuS8JLgxRN0tC89fZAfoucxXmNMFhKLZ-iQMkEp4_wIff3sXemg08neQDHoFOymaLVJPtjbXPSuaLa5vrGDvbWuK4wPAVwKfr0tWh8Ko52BHPoxJgiZOEEHre4jvLiLx-jHh_ffl5_Kyy8fL5bnl2VTSZ7KllJOsDSaGQOG48pgzpmRTaMlMLKoG4ZbiTWvoalWnAqTL0x03cBKUqKBHaN3O9312AywMtNSulfrYAcdtsprq-YdZ3-qzt8oVomaYJoFljuBxvp_CMw7xg9qslRNluZMZcezyundGsH_GiEmNdhooO-1Az9GRZioaiZqzP8HZbVkXIiMvv4LvfZjcNnPiRKcMMbZPdXpHpR1rc97mklUnYusQySlOFNnj1D5rGCwxjtoba7PBt7MBjKTYJM6PcaoLr5dzVm8Y03wMQZo9_4RrKY_-phjrx4-3H7gz6dkvwFNVeBf</recordid><startdate>20130324</startdate><enddate>20130324</enddate><creator>Wang, Jim Jing-Yan</creator><creator>Wang, 
Xiaolei</creator><creator>Gao, Xin</creator><general>BioMed Central Ltd</general><general>BioMed Central</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20130324</creationdate><title>Non-negative matrix factorization by maximizing correntropy for cancer clustering</title><author>Wang, Jim Jing-Yan ; Wang, Xiaolei ; Gao, Xin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-b684t-f224108ca3ccec406c0443c8bba8e3179b30f80a49eb6d425c1861a9bed821ae3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Algorithms</topic><topic>Cancer</topic><topic>Classification</topic><topic>Cluster Analysis</topic><topic>Entropy (Information theory)</topic><topic>Gene expression</topic><topic>Gene Expression Profiling - methods</topic><topic>Genetic 
aspects</topic><topic>Genomes</topic><topic>Health sciences</topic><topic>Humans</topic><topic>Information theory</topic><topic>Matrices</topic><topic>Methodology</topic><topic>Methods</topic><topic>Neoplasms - classification</topic><topic>Neoplasms - genetics</topic><topic>Neoplasms - metabolism</topic><topic>Noise</topic><topic>Oncology, Experimental</topic><topic>Science</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Jim Jing-Yan</creatorcontrib><creatorcontrib>Wang, Xiaolei</creatorcontrib><creatorcontrib>Gao, Xin</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central 
Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>ProQuest Biological Science Collection</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Access via ProQuest (Open Access)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote 
Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wang, Jim Jing-Yan</au><au>Wang, Xiaolei</au><au>Gao, Xin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Non-negative matrix factorization by maximizing correntropy for cancer clustering</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC Bioinformatics</addtitle><date>2013-03-24</date><risdate>2013</risdate><volume>14</volume><issue>1</issue><spage>107</spage><epage>107</epage><pages>107-107</pages><artnum>107</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Non-negative matrix factorization (NMF) has been shown to be a powerful tool for clustering gene expression data, which are widely used to classify cancers. NMF aims to find two non-negative matrices whose product closely approximates the original matrix. Traditional NMF methods minimize either the l2 norm or the Kullback-Leibler distance between the product of the two matrices and the original matrix. Correntropy was recently shown to be an effective similarity measurement due to its stability to outliers or noise.
We propose a maximum correntropy criterion (MCC)-based NMF method (NMF-MCC) for gene expression data-based cancer clustering. Instead of minimizing the l2 norm or the Kullback-Leibler distance, NMF-MCC maximizes the correntropy between the product of the two matrices and the original matrix. The optimization problem can be solved by an expectation conditional maximization algorithm.
Extensive experiments on six cancer benchmark sets demonstrate that the proposed method is significantly more accurate than the state-of-the-art methods in cancer clustering.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>23522344</pmid><doi>10.1186/1471-2105-14-107</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1471-2105 |
ispartof | BMC bioinformatics, 2013-03, Vol.14 (1), p.107-107, Article 107 |
issn | 1471-2105 1471-2105 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3659102 |
source | MEDLINE; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; SpringerNature Journals; PubMed Central Open Access; PubMed Central; Springer Nature OA/Free Journals |
subjects | Algorithms; Cancer; Classification; Cluster Analysis; Entropy (Information theory); Gene expression; Gene Expression Profiling - methods; Genetic aspects; Genomes; Health sciences; Humans; Information theory; Matrices; Methodology; Methods; Neoplasms - classification; Neoplasms - genetics; Neoplasms - metabolism; Noise; Oncology, Experimental; Science; Studies |
title | Non-negative matrix factorization by maximizing correntropy for cancer clustering |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-13T00%3A32%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Non-negative%20matrix%20factorization%20by%20maximizing%20correntropy%20for%20cancer%20clustering&rft.jtitle=BMC%20bioinformatics&rft.au=Wang,%20Jim%20Jing-Yan&rft.date=2013-03-24&rft.volume=14&rft.issue=1&rft.spage=107&rft.epage=107&rft.pages=107-107&rft.artnum=107&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/1471-2105-14-107&rft_dat=%3Cgale_pubme%3EA534518220%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1355413343&rft_id=info:pmid/23522344&rft_galeid=A534518220&rfr_iscdi=true |
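
The description in this record contrasts the l2 norm with correntropy as a measure of how well a factorization approximates the original matrix, and attributes the benefit of correntropy to its stability under outliers. The sketch below illustrates only that contrast on two short vectors; it is not the NMF-MCC algorithm from the article. It assumes the common empirical form of correntropy (the mean of a Gaussian kernel applied to element-wise differences, with the kernel's normalizing constant omitted). The function names, the kernel width `sigma`, and the sample values are hypothetical and chosen for illustration.

```python
import math


def squared_l2(x, y):
    """Squared l2 distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def correntropy(x, y, sigma=1.0):
    """Empirical correntropy with a Gaussian kernel (illustrative form).

    Assumption: the kernel's normalizing constant is dropped, so two
    identical sequences score exactly 1.0 and every term lies in (0, 1].
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(
        math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for a, b in zip(x, y)
    )
    return total / len(x)


# Hypothetical data: an "original" vector, a close approximation,
# and the same approximation with one grossly corrupted entry.
original = [1.0, 2.0, 3.0, 4.0]
approx_clean = [1.1, 1.9, 3.0, 4.1]
approx_outlier = [1.1, 1.9, 3.0, 104.1]

if __name__ == "__main__":
    for name, approx in (("clean", approx_clean), ("outlier", approx_outlier)):
        print(
            name,
            "squared l2 =", squared_l2(original, approx),
            "correntropy =", correntropy(original, approx),
        )
```

Because each kernel term is bounded by 1, a single corrupted entry can lower the correntropy of an N-element comparison by at most 1/N, whereas its contribution to the squared l2 distance grows without bound. Maximizing correntropy therefore keeps such entries from dominating the fit, which matches the motivation stated in the description.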