Tensor-tensor algebra for optimal representation and compression of multiway data

Published in: Proceedings of the National Academy of Sciences - PNAS, 2021-07, Vol. 118 (28), p. 1-12
Main authors: Kilmer, Misha E., Horesh, Lior, Avron, Haim, Newman, Elizabeth
Format: Article
Language: English
Online access: Full text
container_end_page 12
container_issue 28
container_start_page 1
container_title Proceedings of the National Academy of Sciences - PNAS
container_volume 118
creator Kilmer, Misha E.
Horesh, Lior
Avron, Haim
Newman, Elizabeth
description With the advent of machine learning and its overarching pervasiveness, it is imperative to devise ways to represent large datasets efficiently while distilling the intrinsic features necessary for subsequent analysis. The primary workhorse used in data dimensionality reduction and feature extraction has been the matrix singular value decomposition (SVD), which presupposes that data have been arranged in matrix format. A primary goal in this study is to show that high-dimensional datasets are more compressible when treated as tensors (i.e., multiway arrays) and compressed via tensor-SVDs under the tensor-tensor product construct and its generalizations. We begin by proving Eckart–Young optimality results for families of tensor-SVDs under two different truncation strategies. Since such optimality properties can be proven in both matrix and tensor-based algebras, a fundamental question arises: Does the tensor construct subsume the matrix construct in terms of representation efficiency? The answer is positive, as proven by showing that a tensor-tensor representation of an equal-dimensional spanning space can be superior to its matrix counterpart. We then use these optimality results to investigate how the compressed representation provided by the truncated tensor-SVD is related, both theoretically and empirically, to its two closest tensor-based analogs: the truncated higher-order SVD and the truncated tensor-train SVD.
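The Eckart–Young theorem referenced in the description states that truncating the matrix SVD gives the best rank-k approximation in the Frobenius (and spectral) norm. A minimal NumPy sketch of this baseline (the function name is illustrative, not from the paper):

```python
import numpy as np

def best_rank_k(A, k):
    """Best rank-k approximation of A in the Frobenius norm,
    per the Eckart-Young theorem: keep the k leading singular
    triplets of the SVD and discard the rest."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    # Scale the first k left singular vectors by the singular values,
    # then multiply by the first k right singular vectors.
    return (U[:, :k] * s[:k]) @ Vh[:k, :]

# A rank-2 matrix is recovered exactly by its rank-2 truncation.
A = np.array([[1., 2., 3.],
              [2., 3., 4.],
              [3., 4., 5.],
              [4., 5., 6.]])
A2 = best_rank_k(A, 2)
```

Because `A` above is the sum of two rank-1 outer products, its rank-2 truncation reproduces it to floating-point accuracy, while the rank-1 truncation incurs the smallest possible Frobenius error among all rank-1 matrices.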
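The tensor-SVDs in the description are defined relative to a chosen tensor-tensor product. For the widely used t-product (one member of the family the paper generalizes), the t-SVD of a third-order tensor reduces to independent matrix SVDs of the frontal slices after a DFT along the third mode. A hedged NumPy sketch of truncated t-SVD compression under that assumption (function name and truncation interface are illustrative):

```python
import numpy as np

def tsvd_truncate(A, k):
    """Truncated t-SVD of a 3-way array A (n1 x n2 x n3) under the
    t-product: transform along the third mode with the DFT, truncate
    the matrix SVD of each frontal slice to rank k, then invert the
    transform. Returns a real array when A is real."""
    Ahat = np.fft.fft(A, axis=2)          # move to the transform domain
    Bhat = np.empty_like(Ahat)
    for i in range(A.shape[2]):
        # Ordinary matrix SVD of each frontal slice in the transform domain.
        U, s, Vh = np.linalg.svd(Ahat[:, :, i], full_matrices=False)
        Bhat[:, :, i] = (U[:, :k] * s[:k]) @ Vh[:k, :]
    return np.real(np.fft.ifft(Bhat, axis=2))
```

With `k` equal to the smaller of the first two dimensions, no truncation occurs and the tensor is reconstructed to floating-point accuracy; smaller `k` trades accuracy for compression, and (as the paper's optimality results formalize for this family) the truncation error is nonincreasing in `k`.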
doi_str_mv 10.1073/pnas.2015851118
format Article
fullrecord PMID: 34234014; Publisher: National Academy of Sciences, United States
fulltext fulltext
identifier ISSN: 0027-8424
ispartof Proceedings of the National Academy of Sciences - PNAS, 2021-07, Vol.118 (28), p.1-12
issn 0027-8424
1091-6490
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8285895
source Jstor Complete Legacy; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects Compressibility
Compression
Datasets
Distillation
Feature extraction
Learning algorithms
Machine learning
Mathematical analysis
Matrices (mathematics)
Matrix algebra
Optimization
Physical Sciences
Representations
Singular value decomposition
Tensors
title Tensor-tensor algebra for optimal representation and compression of multiway data