Structured Matrices and Their Application in Neural Networks: A Survey

Modern neural network architectures are becoming larger and deeper, with increasing computational resources needed for training and inference. One approach toward handling this increased resource consumption is to use structured weight matrices. By exploiting structure in weight matrices, the computational complexity of propagating information through the network can be reduced. However, choosing the right structure is not trivial, especially since there are many different matrix structures and structure classes. In this paper, we give an overview of the four main matrix structure classes, namely semiseparable matrices, matrices of low displacement rank, hierarchical matrices, and products of sparse matrices. We recapitulate the definitions of each structure class, present special structure subclasses, and provide references to research papers in which the structures are used in the domain of neural networks. We present two benchmarks comparing the classes. First, we benchmark the error for approximating different test matrices. Second, we compare the prediction performance of neural networks in which the weight matrix of the last layer is replaced by structured matrices. After presenting the benchmark results, we discuss open research questions related to the use of structured matrices in neural networks and highlight future research directions.
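The complexity argument in the abstract can be made concrete with a small sketch. The following is an illustrative example, not code from the paper: it uses a rank-k factorization, one of the simplest structured forms closely related to the classes the survey covers (off-diagonal blocks of semiseparable and hierarchical matrices are low rank, for instance), and mirrors the paper's two benchmarks in spirit: approximating a test matrix, and propagating an activation through a structured layer at reduced cost. All sizes, variable names, and the NumPy setup are assumptions chosen for illustration.

# Illustrative sketch (not from the paper): a rank-k factorized weight as a
# stand-in for the structured classes surveyed. All sizes are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, k = 1024, 32                              # layer width, assumed rank

# Test matrix that is approximately low rank (plus small noise), so that a
# structured approximation is actually meaningful.
W = rng.standard_normal((n, k)) @ rng.standard_normal((k, n)) \
    + 0.01 * rng.standard_normal((n, n))
x = rng.standard_normal(n)                   # incoming activation

# Benchmark 1 in spirit: approximation error of a structured class.
# Truncated SVD gives the best rank-k approximation in the Frobenius norm.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
Uk = U[:, :k] * s[:k]                        # fold singular values into U
Vk = Vt[:k, :]
err = np.linalg.norm(W - Uk @ Vk) / np.linalg.norm(W)
print(f"relative rank-{k} approximation error: {err:.4f}")

# Benchmark 2 in spirit: propagate an activation through the layer.
# Dense matvec costs ~n^2 multiply-adds; the factored form costs ~2nk.
y_dense = W @ x                              # ~ n*n   multiply-adds
y_fast  = Uk @ (Vk @ x)                      # ~ 2*n*k multiply-adds
print("multiply-adds dense/structured:", n * n, "/", 2 * n * k)
print("output deviation:", np.linalg.norm(y_dense - y_fast) / np.linalg.norm(y_dense))

When k is much smaller than n, the factored layer needs roughly 2nk instead of n^2 multiply-adds; the paper's benchmarks ask the analogous question for each of the four structure classes, namely how much approximation or prediction accuracy is lost at a given computational budget.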

Bibliographic Details

Published in: New Generation Computing, 2023-09, Vol. 41 (3), pp. 697-722
Authors: Kissel, Matthias; Diepold, Klaus
Format: Article
Language: English
Publisher: Springer Japan (Tokyo)
DOI: 10.1007/s00354-023-00226-1
ISSN: 0288-3635
EISSN: 1882-7055
Subjects: Artificial Intelligence; Benchmarks; Computer architecture; Computer Hardware; Computer Science; Computer Systems Organization and Communication Networks; Neural networks; Software Engineering/Programming and Operating Systems; Sparse matrices; Structured matrices
Online access: Full text