A Tale of Two Matrix Factorizations

In statistical practice, rectangular tables of numeric data are commonplace, and are often analyzed using dimension-reduction methods like the singular value decomposition and its close cousin, principal component analysis (PCA). This analysis produces score and loading matrices representing the row...

Detailed description

Bibliographic details
Published in:	The American statistician 2013-11, Vol.67 (4), p.207-218
Main authors:	Fogel, Paul, Hawkins, Douglas M., Beecher, Chris, Luta, George, Young, S. Stanley
Format:	Article
Language:	eng
Subjects:	Discriminant analysis Geometry Latent dimensions Matrix Nonnegative matrix factorization Numerical analysis Principal component analysis Principal components analysis Singular value decomposition Statistical analysis Statistical Practice Statistics
Online access:	Full text

container_end_page	218
container_issue	4
container_start_page	207
container_title	The American statistician
container_volume	67
creator	Fogel, Paul Hawkins, Douglas M. Beecher, Chris Luta, George Young, S. Stanley
description	In statistical practice, rectangular tables of numeric data are commonplace, and are often analyzed using dimension-reduction methods like the singular value decomposition and its close cousin, principal component analysis (PCA). This analysis produces score and loading matrices representing the rows and the columns of the original table and these matrices may be used for both prediction purposes and to gain structural understanding of the data. In some tables, the data entries are necessarily nonnegative (apart, perhaps, from some small random noise), and so the matrix factors meant to represent them should arguably also contain only nonnegative elements. This thinking, and the desire for parsimony, underlies such techniques as rotating factors in a search for "simple structure." These attempts to transform score or loading matrices of mixed sign into nonnegative, parsimonious forms are, however, indirect and at best imperfect. The recent development of nonnegative matrix factorization, or NMF, is an attractive alternative. Rather than attempt to transform a loading or score matrix of mixed signs into one with only nonnegative elements, it directly seeks matrix factors containing only nonnegative elements. The resulting factorization often leads to substantial improvements in interpretability of the factors. We illustrate this potential by synthetic examples and a real dataset. The question of exactly when NMF is effective is not fully resolved, but some indicators of its domain of success are given. It is pointed out that the NMF factors can be used in much the same way as those coming from PCA for such tasks as ordination, clustering, and prediction. Supplementary materials for this article are available online.
doi_str_mv	10.1080/00031305.2013.845607
format	Article
fullrecord	<record><control><sourceid>jstor_proqu</sourceid><recordid>TN_cdi_proquest_journals_1462218237</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>24591483</jstor_id><sourcerecordid>24591483</sourcerecordid><originalsourceid>FETCH-LOGICAL-c357t-2742f02e00348f726e37400df4a58e975538c9a907aa9303cf81aa5a088ab6773</originalsourceid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1462218237</pqid></control><display><type>article</type><title>A Tale of Two Matrix Factorizations</title><source>Jstor Complete Legacy</source><source>JSTOR Mathematics &amp; Statistics</source><creator>Fogel, Paul ; Hawkins, Douglas M. ; Beecher, Chris ; Luta, George ; Young, S. Stanley</creator><creatorcontrib>Fogel, Paul ; Hawkins, Douglas M. ; Beecher, Chris ; Luta, George ; Young, S. Stanley</creatorcontrib><identifier>ISSN: 0003-1305</identifier><identifier>EISSN: 1537-2731</identifier><identifier>DOI: 10.1080/00031305.2013.845607</identifier><identifier>CODEN: ASTAAJ</identifier><language>eng</language><publisher>Alexandria: Taylor &amp; Francis Group</publisher><subject>Discriminant analysis ; Geometry ; Latent dimensions ; Matrix ; Nonnegative matrix factorization ; Numerical analysis ; Principal component analysis ; Principal components analysis ; Singular value decomposition ; Statistical analysis ; Statistical Practice ; Statistics</subject><ispartof>The American statistician, 2013-11, Vol.67 (4), p.207-218</ispartof><rights>Copyright Taylor &amp; Francis Group, LLC 2013</rights><rights>Copyright 2013 American Statistical Association</rights><rights>Copyright Taylor &amp; Francis Ltd. 2013</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c357t-2742f02e00348f726e37400df4a58e975538c9a907aa9303cf81aa5a088ab6773</citedby><cites>FETCH-LOGICAL-c357t-2742f02e00348f726e37400df4a58e975538c9a907aa9303cf81aa5a088ab6773</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/24591483$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/24591483$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>314,776,780,799,828,27901,27902,57992,57996,58225,58229</link.rule.ids></links><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fogel, Paul</au><au>Hawkins, Douglas M.</au><au>Beecher, Chris</au><au>Luta, George</au><au>Young, S. Stanley</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Tale of Two Matrix Factorizations</atitle><jtitle>The American statistician</jtitle><date>2013-11-01</date><risdate>2013</risdate><volume>67</volume><issue>4</issue><spage>207</spage><epage>218</epage><pages>207-218</pages><issn>0003-1305</issn><eissn>1537-2731</eissn><coden>ASTAAJ</coden><cop>Alexandria</cop><pub>Taylor &amp; Francis Group</pub><doi>10.1080/00031305.2013.845607</doi><tpages>12</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0003-1305
ispartof	The American statistician, 2013-11, Vol.67 (4), p.207-218
issn	0003-1305 1537-2731
language	eng
recordid	cdi_proquest_journals_1462218237
source	Jstor Complete Legacy; JSTOR Mathematics & Statistics
subjects	Discriminant analysis Geometry Latent dimensions Matrix Nonnegative matrix factorization Numerical analysis Principal component analysis Principal components analysis Singular value decomposition Statistical analysis Statistical Practice Statistics
title	A Tale of Two Matrix Factorizations
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T18%3A50%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Tale%20of%20Two%20Matrix%20Factorizations&rft.jtitle=The%20American%20statistician&rft.au=Fogel,%20Paul&rft.date=2013-11-01&rft.volume=67&rft.issue=4&rft.spage=207&rft.epage=218&rft.pages=207-218&rft.issn=0003-1305&rft.eissn=1537-2731&rft.coden=ASTAAJ&rft_id=info:doi/10.1080/00031305.2013.845607&rft_dat=%3Cjstor_proqu%3E24591483%3C/jstor_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1462218237&rft_id=info:pmid/&rft_jstor_id=24591483&rfr_iscdi=true