A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network

Eigenvalue Decomposition (EVD) and Singular Value Decomposition (SVD) are two crucial transformations in many signal processing applications. The main drawback of these algorithms is their computationally intensive nature, which prevents them from being efficiently exploited in high-performance, real-time...

Detailed description

Bibliographic details
Published in: IEEE transactions on circuits and systems. I, Regular papers, 2020-06, Vol.67 (6), p.1912-1924
Main authors: Shahshahani, Seyed Mohamad Reza, Mahdiani, Hamid Reza
Format: Article
Language: eng
Subjects:
container_end_page 1924
container_issue 6
container_start_page 1912
container_title IEEE transactions on circuits and systems. I, Regular papers
container_volume 67
creator Shahshahani, Seyed Mohamad Reza
Mahdiani, Hamid Reza
description Eigenvalue Decomposition (EVD) and Singular Value Decomposition (SVD) are two crucial transformations in many signal processing applications. The main drawback of these algorithms is their computationally intensive nature, which prevents them from being efficiently exploited in high-performance, real-time, and mobile applications. By extracting the inherent parallelism of the Jacobi SVD, a new parallel data distribution and access pattern for this algorithm is first proposed. Based on this data distribution, a novel shared-memory architecture is then proposed to support EVD/SVD computation in a high-performance and scalable manner. A new Multistage Interconnection Network based on Batcher's odd-even merge sorting network is developed and exploited in the architecture to preserve its performance and scalability by simultaneously connecting different numbers of processing elements to the system memory hierarchy in a parallel, conflict-free manner. The proposed architecture can be configured to compute the EVD/SVD of matrices of arbitrary size, with different numbers of processing elements, achieving a linear speed-up. Synthesis results in a 90 nm technology show that, for an $8\times 8$ matrix at a frequency of 813 MHz, the system with one, two, and four processing elements achieves a throughput of 1.81, 3.63, and 7.26 million EVD/SVDs per second, respectively.
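The description names Batcher's odd-even merge sorting network as the basis of the interconnection network, but this record does not say how the paper adapts it. As a rough illustration only, the following Python sketch generates the classic Batcher comparator network for a power-of-two number of lines and checks it with the zero-one principle. It is the textbook sorting network, not the authors' interconnect, and every function name in it (`oddeven_merge`, `oddeven_merge_sort`, `batcher_network`, `apply_network`, `sorts_all_binary_inputs`) is hypothetical.

```python
from itertools import product


def oddeven_merge(lo, hi, r):
    """Yield comparators merging the two sorted halves of lines lo..hi (inclusive)."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)       # merge the even-indexed lines
        yield from oddeven_merge(lo + r, hi, step)   # merge the odd-indexed lines
        for i in range(lo + r, hi - r, step):        # final compare-exchange row
            yield (i, i + r)
    else:
        yield (lo, lo + r)


def oddeven_merge_sort(lo, hi):
    """Yield comparators that sort lines lo..hi (inclusive)."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from oddeven_merge_sort(lo, mid)       # sort the lower half
        yield from oddeven_merge_sort(mid + 1, hi)   # sort the upper half
        yield from oddeven_merge(lo, hi, 1)          # merge both halves


def batcher_network(n):
    """Return the comparator list for n lines; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    return list(oddeven_merge_sort(0, n - 1))


def apply_network(network, values):
    """Run the compare-exchange operations of the network on a copy of values."""
    out = list(values)
    for i, j in network:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def sorts_all_binary_inputs(network, n):
    """Zero-one principle: a network that sorts all 0/1 inputs sorts every input."""
    return all(
        apply_network(network, bits) == sorted(bits)
        for bits in product((0, 1), repeat=n)
    )


if __name__ == "__main__":
    net = batcher_network(8)
    print(len(net))                          # 19 comparators for 8 lines
    print(sorts_all_binary_inputs(net, 8))   # True
```

The comparator list is fixed for a given number of lines and does not depend on the data, which is the property that makes such networks attractive as hardware routing structures. How the paper turns this into a conflict-free connection between processing elements and memory is not stated in this record.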
doi_str_mv 10.1109/TCSI.2020.2973249
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2408657618</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9007016</ieee_id><sourcerecordid>2408657618</sourcerecordid><originalsourceid>FETCH-LOGICAL-c293t-4fde5ebab0c3fb472e5006aef3857f53be575a4d98f8e78f48197ea91fa8b0983</originalsourceid><addsrcrecordid>eNo9kE1LxDAQhoso-PkDxEvAg6euk6Rpk-O6fuMXrHotaTrZVncbnXQRwR_vlhVP88I87ww8SXLIYcQ5mNPnyfRmJEDASJhCisxsJDtcKZ2ChnxzyJlJtRR6O9mN8Q1AGJB8J_kZs-t21qRPSD7QwnYO2dTZua3mq9BYwjq9x0WgbzZ9PWdPFBzGGIiNyTVtj65fErIzG7FmoWO31oWqZeP5LFDbNwtmu3q17V2DdBLZNFDfdjP2gP1XoPf9ZMvbecSDv7mXvFxePE-u07vHq5vJ-C51wsg-zXyNCitbgZO-ygqBCiC36KVWhVeyQlUom9VGe42F9pnmpkBruLe6AqPlXnK8vvtB4XOJsS_fwpK61ctSZKBzVeR8oPiachRiJPTlB7ULS98lh3KQXA6Sy0Fy-Sd51Tlad1pE_OcNQAE8l78qKHk9</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2408657618</pqid></control><display><type>article</type><title>A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network</title><source>IEEE Electronic Library (IEL)</source><creator>Shahshahani, Seyed Mohamad Reza ; Mahdiani, Hamid Reza</creator><creatorcontrib>Shahshahani, Seyed Mohamad Reza ; Mahdiani, Hamid Reza</creatorcontrib><description>Eigenvalue Decomposition (EVD) and Singular Value Decomposition (SVD) are two crucial transformations in many signal processing applications. The main drawback of these algorithms is their computationally intensive nature which prevents them to be efficiently exploited in high-performance, real-time and mobile applications. By extracting the inherent parallelism of the Jacobi SVD, a new parallel data distribution and access pattern for this algorithm is proposed first. 
Based on the proposed parallel data distribution, a novel shared-memory architecture is then proposed to support EVD/SVD computation in a high-performance and scalable manner. A new Multistage Interconnection Network based on Batcher's odd-even merge sorting network is developed and exploited in the architecture to preserve its performance and scalability by simultaneously connecting different numbers of processing elements to the system memory hierarchy in a parallel conflict-free manner. The proposed architecture can be configured to compute EVD/SVD of matrices of arbitrary size, with different numbers of processing elements achieving a linear speed-up. The synthesis results in a 90 nm technology show that the system with one, two, and four processing elements achieves a throughput of 1.81, 3.63, and 7.26 million EVD/SVD's per second, respectively with a frequency of 813 MHz for an &lt;inline-formula&gt; &lt;tex-math notation="LaTeX"&gt;8\times 8 &lt;/tex-math&gt;&lt;/inline-formula&gt; matrix.</description><identifier>ISSN: 1549-8328</identifier><identifier>EISSN: 1558-0806</identifier><identifier>DOI: 10.1109/TCSI.2020.2973249</identifier><identifier>CODEN: ITCSCH</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Applications programs ; ASIC ; Batcher’s sorting network ; brain-computer interface ; Computer architecture ; Computer memory ; Eigenvalues ; independent component analysis ; Jacobi EVD/SVD ; Jacobian matrices ; Microprocessors ; Mobile computing ; motor imagery ; multi-stage interconnection network ; Parallel processing ; Principal component analysis ; Scalability ; shared-memory architecture ; Signal processing ; Signal processing algorithms ; Singular value decomposition ; Sorting algorithms ; Symmetric matrices</subject><ispartof>IEEE transactions on circuits and systems. I, Regular papers, 2020-06, Vol.67 (6), p.1912-1924</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c293t-4fde5ebab0c3fb472e5006aef3857f53be575a4d98f8e78f48197ea91fa8b0983</citedby><cites>FETCH-LOGICAL-c293t-4fde5ebab0c3fb472e5006aef3857f53be575a4d98f8e78f48197ea91fa8b0983</cites><orcidid>0000-0002-7237-4234 ; 0000-0002-6840-1033</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9007016$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54737</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9007016$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Shahshahani, Seyed Mohamad Reza</creatorcontrib><creatorcontrib>Mahdiani, Hamid Reza</creatorcontrib><title>A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network</title><title>IEEE transactions on circuits and systems. I, Regular papers</title><addtitle>TCSI</addtitle><description>Eigenvalue Decomposition (EVD) and Singular Value Decomposition (SVD) are two crucial transformations in many signal processing applications. The main drawback of these algorithms is their computationally intensive nature which prevents them to be efficiently exploited in high-performance, real-time and mobile applications. By extracting the inherent parallelism of the Jacobi SVD, a new parallel data distribution and access pattern for this algorithm is proposed first. Based on the proposed parallel data distribution, a novel shared-memory architecture is then proposed to support EVD/SVD computation in a high-performance and scalable manner. 
A new Multistage Interconnection Network based on Batcher's odd-even merge sorting network is developed and exploited in the architecture to preserve its performance and scalability by simultaneously connecting different numbers of processing elements to the system memory hierarchy in a parallel conflict-free manner. The proposed architecture can be configured to compute EVD/SVD of matrices of arbitrary size, with different numbers of processing elements achieving a linear speed-up. The synthesis results in a 90 nm technology show that the system with one, two, and four processing elements achieves a throughput of 1.81, 3.63, and 7.26 million EVD/SVD's per second, respectively with a frequency of 813 MHz for an &lt;inline-formula&gt; &lt;tex-math notation="LaTeX"&gt;8\times 8 &lt;/tex-math&gt;&lt;/inline-formula&gt; matrix.</description><subject>Applications programs</subject><subject>ASIC</subject><subject>Batcher’s sorting network</subject><subject>brain-computer interface</subject><subject>Computer architecture</subject><subject>Computer memory</subject><subject>Eigenvalues</subject><subject>independent component analysis</subject><subject>Jacobi EVD/SVD</subject><subject>Jacobian matrices</subject><subject>Microprocessors</subject><subject>Mobile computing</subject><subject>motor imagery</subject><subject>multi-stage interconnection network</subject><subject>Parallel processing</subject><subject>Principal component analysis</subject><subject>Scalability</subject><subject>shared-memory architecture</subject><subject>Signal processing</subject><subject>Signal processing algorithms</subject><subject>Singular value decomposition</subject><subject>Sorting algorithms</subject><subject>Symmetric 
matrices</subject><issn>1549-8328</issn><issn>1558-0806</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1LxDAQhoso-PkDxEvAg6euk6Rpk-O6fuMXrHotaTrZVncbnXQRwR_vlhVP88I87ww8SXLIYcQ5mNPnyfRmJEDASJhCisxsJDtcKZ2ChnxzyJlJtRR6O9mN8Q1AGJB8J_kZs-t21qRPSD7QwnYO2dTZua3mq9BYwjq9x0WgbzZ9PWdPFBzGGIiNyTVtj65fErIzG7FmoWO31oWqZeP5LFDbNwtmu3q17V2DdBLZNFDfdjP2gP1XoPf9ZMvbecSDv7mXvFxePE-u07vHq5vJ-C51wsg-zXyNCitbgZO-ygqBCiC36KVWhVeyQlUom9VGe42F9pnmpkBruLe6AqPlXnK8vvtB4XOJsS_fwpK61ctSZKBzVeR8oPiachRiJPTlB7ULS98lh3KQXA6Sy0Fy-Sd51Tlad1pE_OcNQAE8l78qKHk9</recordid><startdate>20200601</startdate><enddate>20200601</enddate><creator>Shahshahani, Seyed Mohamad Reza</creator><creator>Mahdiani, Hamid Reza</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0002-7237-4234</orcidid><orcidid>https://orcid.org/0000-0002-6840-1033</orcidid></search><sort><creationdate>20200601</creationdate><title>A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network</title><author>Shahshahani, Seyed Mohamad Reza ; Mahdiani, Hamid Reza</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c293t-4fde5ebab0c3fb472e5006aef3857f53be575a4d98f8e78f48197ea91fa8b0983</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Applications programs</topic><topic>ASIC</topic><topic>Batcher’s sorting network</topic><topic>brain-computer interface</topic><topic>Computer architecture</topic><topic>Computer memory</topic><topic>Eigenvalues</topic><topic>independent component analysis</topic><topic>Jacobi 
EVD/SVD</topic><topic>Jacobian matrices</topic><topic>Microprocessors</topic><topic>Mobile computing</topic><topic>motor imagery</topic><topic>multi-stage interconnection network</topic><topic>Parallel processing</topic><topic>Principal component analysis</topic><topic>Scalability</topic><topic>shared-memory architecture</topic><topic>Signal processing</topic><topic>Signal processing algorithms</topic><topic>Singular value decomposition</topic><topic>Sorting algorithms</topic><topic>Symmetric matrices</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Shahshahani, Seyed Mohamad Reza</creatorcontrib><creatorcontrib>Mahdiani, Hamid Reza</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on circuits and systems. I, Regular papers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Shahshahani, Seyed Mohamad Reza</au><au>Mahdiani, Hamid Reza</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network</atitle><jtitle>IEEE transactions on circuits and systems. 
I, Regular papers</jtitle><stitle>TCSI</stitle><date>2020-06-01</date><risdate>2020</risdate><volume>67</volume><issue>6</issue><spage>1912</spage><epage>1924</epage><pages>1912-1924</pages><issn>1549-8328</issn><eissn>1558-0806</eissn><coden>ITCSCH</coden><abstract>Eigenvalue Decomposition (EVD) and Singular Value Decomposition (SVD) are two crucial transformations in many signal processing applications. The main drawback of these algorithms is their computationally intensive nature which prevents them to be efficiently exploited in high-performance, real-time and mobile applications. By extracting the inherent parallelism of the Jacobi SVD, a new parallel data distribution and access pattern for this algorithm is proposed first. Based on the proposed parallel data distribution, a novel shared-memory architecture is then proposed to support EVD/SVD computation in a high-performance and scalable manner. A new Multistage Interconnection Network based on Batcher's odd-even merge sorting network is developed and exploited in the architecture to preserve its performance and scalability by simultaneously connecting different numbers of processing elements to the system memory hierarchy in a parallel conflict-free manner. The proposed architecture can be configured to compute EVD/SVD of matrices of arbitrary size, with different numbers of processing elements achieving a linear speed-up. The synthesis results in a 90 nm technology show that the system with one, two, and four processing elements achieves a throughput of 1.81, 3.63, and 7.26 million EVD/SVD's per second, respectively with a frequency of 813 MHz for an &lt;inline-formula&gt; &lt;tex-math notation="LaTeX"&gt;8\times 8 &lt;/tex-math&gt;&lt;/inline-formula&gt; matrix.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TCSI.2020.2973249</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-7237-4234</orcidid><orcidid>https://orcid.org/0000-0002-6840-1033</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1549-8328
ispartof IEEE transactions on circuits and systems. I, Regular papers, 2020-06, Vol.67 (6), p.1912-1924
issn 1549-8328
1558-0806
language eng
recordid cdi_proquest_journals_2408657618
source IEEE Electronic Library (IEL)
subjects Applications programs
ASIC
Batcher’s sorting network
brain-computer interface
Computer architecture
Computer memory
Eigenvalues
independent component analysis
Jacobi EVD/SVD
Jacobian matrices
Microprocessors
Mobile computing
motor imagery
multi-stage interconnection network
Parallel processing
Principal component analysis
Scalability
shared-memory architecture
Signal processing
Signal processing algorithms
Singular value decomposition
Sorting algorithms
Symmetric matrices
title A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T19%3A07%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20High-Performance%20Scalable%20Shared-Memory%20SVD%20Processor%20Architecture%20Based%20on%20Jacobi%20Algorithm%20and%20Batcher's%20Sorting%20Network&rft.jtitle=IEEE%20transactions%20on%20circuits%20and%20systems.%20I,%20Regular%20papers&rft.au=Shahshahani,%20Seyed%20Mohamad%20Reza&rft.date=2020-06-01&rft.volume=67&rft.issue=6&rft.spage=1912&rft.epage=1924&rft.pages=1912-1924&rft.issn=1549-8328&rft.eissn=1558-0806&rft.coden=ITCSCH&rft_id=info:doi/10.1109/TCSI.2020.2973249&rft_dat=%3Cproquest_RIE%3E2408657618%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2408657618&rft_id=info:pmid/&rft_ieee_id=9007016&rfr_iscdi=true