Sample Complexity of Dictionary Learning and Other Matrix Factorizations

Published in: IEEE transactions on information theory, 2015-06, Vol. 61 (6), p. 3469-3486
Main authors: Gribonval, Remi; Jenatton, Rodolphe; Bach, Francis; Kleinsteuber, Martin; Seibert, Matthias
Format: Article
Language: eng
Online access: Order full text
container_end_page 3486
container_issue 6
container_start_page 3469
container_title IEEE transactions on information theory
container_volume 61
creator Gribonval, Remi
Jenatton, Rodolphe
Bach, Francis
Kleinsteuber, Martin
Seibert, Matthias
description Many modern tools in machine learning and signal processing, such as sparse dictionary learning, principal component analysis, non-negative matrix factorization, and K-means clustering, rely on the factorization of a matrix obtained by concatenating high-dimensional vectors from a training collection. While the idealized task would be to optimize the expected quality of the factors over the underlying distribution of training vectors, in practice this is achieved by minimizing an empirical average over the considered collection. The focus of this paper is to provide sample complexity estimates that uniformly control how much the empirical average deviates from the expected cost function. Standard arguments then imply that the performance of the empirical predictor also exhibits such guarantees. The genericity of the approach encompasses several possible constraints on the factors (tensor product structure, shift-invariance, sparsity...), thus providing a unified perspective on the sample complexity of several widely used matrix factorization schemes. The derived generalization bounds behave proportionally to (log(n)/n)^{1/2} with respect to the number of samples n for the considered matrix factorization techniques.
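
The scaling claimed in the abstract can be made concrete with a schematic version of the uniform deviation bound it describes. This is a sketch under assumed generic notation, not the paper's exact theorem: the loss \ell, the constrained factor class \mathfrak{D}, and the constant c are placeholders for whatever cost and constraint set a given factorization scheme uses.

\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Schematic uniform deviation bound (placeholder notation, not the
% paper's exact statement): D ranges over a constrained factor class
% \mathfrak{D}, \ell(x, D) is the factorization cost of a training
% vector x, and the constant c depends on the class but not on n.
\[
\sup_{D \in \mathfrak{D}}
\left| \frac{1}{n} \sum_{i=1}^{n} \ell(x_i, D) - \mathbb{E}_{x}\,\ell(x, D) \right|
\;\le\; c \sqrt{\frac{\log n}{n}}
\]
\end{document}

Because the bound holds uniformly over the class, the "standard arguments" the abstract invokes follow directly: the expected cost of the empirical minimizer is within twice this deviation of the best achievable expected cost, so it inherits the (log(n)/n)^{1/2} rate.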
doi_str_mv 10.1109/TIT.2015.2424238
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 0018-9448
ispartof IEEE transactions on information theory, 2015-06, Vol.61 (6), p.3469-3486
issn 0018-9448
1557-9654
language eng
recordid cdi_ieee_primary_7088631
source IEEE Electronic Library (IEL)
subjects Complexity theory
Computer Science
Dictionary learning
Electronics
Engineering Sciences
Information Theory
K-means clustering
Machine Learning
Mathematics
Matrix
non-negative matrix factorization
Principal component analysis
Probability distribution
sample complexity
Signal and Image Processing
Signal processing
sparse coding
Sparse matrices
Sparsity
Statistics
structured learning
Training
title Sample Complexity of Dictionary Learning and Other Matrix Factorizations
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T08%3A28%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Sample%20Complexity%20of%20Dictionary%20Learning%20and%20Other%20Matrix%20Factorizations&rft.jtitle=IEEE%20transactions%20on%20information%20theory&rft.au=Gribonval,%20Remi&rft.date=2015-06-01&rft.volume=61&rft.issue=6&rft.spage=3469&rft.epage=3486&rft.pages=3469-3486&rft.issn=0018-9448&rft.eissn=1557-9654&rft.coden=IETTAW&rft_id=info:doi/10.1109/TIT.2015.2424238&rft_dat=%3Cproquest_RIE%3E3701035421%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1684590488&rft_id=info:pmid/&rft_ieee_id=7088631&rfr_iscdi=true