A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network
Eigenvalue Decomposition (EVD) and Singular Value Decomposition (SVD) are two crucial transformations in many signal processing applications. The main drawback of these algorithms is their computationally intensive nature, which prevents them from being efficiently exploited in high-performance, real-time...
Published in: | IEEE transactions on circuits and systems. I, Regular papers, 2020-06, Vol.67 (6), p.1912-1924 |
---|---|
Main authors: | Shahshahani, Seyed Mohamad Reza ; Mahdiani, Hamid Reza |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 1924 |
---|---|
container_issue | 6 |
container_start_page | 1912 |
container_title | IEEE transactions on circuits and systems. I, Regular papers |
container_volume | 67 |
creator | Shahshahani, Seyed Mohamad Reza ; Mahdiani, Hamid Reza |
description | Eigenvalue Decomposition (EVD) and Singular Value Decomposition (SVD) are two crucial transformations in many signal processing applications. The main drawback of these algorithms is their computationally intensive nature, which prevents them from being efficiently exploited in high-performance, real-time and mobile applications. By extracting the inherent parallelism of the Jacobi SVD, a new parallel data distribution and access pattern for this algorithm is proposed first. Based on the proposed parallel data distribution, a novel shared-memory architecture is then proposed to support EVD/SVD computation in a high-performance and scalable manner. A new Multistage Interconnection Network based on Batcher's odd-even merge sorting network is developed and exploited in the architecture to preserve its performance and scalability by simultaneously connecting different numbers of processing elements to the system memory hierarchy in a parallel, conflict-free manner. The proposed architecture can be configured to compute the EVD/SVD of matrices of arbitrary size, with different numbers of processing elements achieving a linear speed-up. The synthesis results in a 90 nm technology show that, for an $8\times 8$ matrix at a frequency of 813 MHz, the system with one, two, and four processing elements achieves a throughput of 1.81, 3.63, and 7.26 million EVD/SVDs per second, respectively. |
doi_str_mv | 10.1109/TCSI.2020.2973249 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2408657618</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9007016</ieee_id><sourcerecordid>2408657618</sourcerecordid><originalsourceid>FETCH-LOGICAL-c293t-4fde5ebab0c3fb472e5006aef3857f53be575a4d98f8e78f48197ea91fa8b0983</originalsourceid><addsrcrecordid>eNo9kE1LxDAQhoso-PkDxEvAg6euk6Rpk-O6fuMXrHotaTrZVncbnXQRwR_vlhVP88I87ww8SXLIYcQ5mNPnyfRmJEDASJhCisxsJDtcKZ2ChnxzyJlJtRR6O9mN8Q1AGJB8J_kZs-t21qRPSD7QwnYO2dTZua3mq9BYwjq9x0WgbzZ9PWdPFBzGGIiNyTVtj65fErIzG7FmoWO31oWqZeP5LFDbNwtmu3q17V2DdBLZNFDfdjP2gP1XoPf9ZMvbecSDv7mXvFxePE-u07vHq5vJ-C51wsg-zXyNCitbgZO-ygqBCiC36KVWhVeyQlUom9VGe42F9pnmpkBruLe6AqPlXnK8vvtB4XOJsS_fwpK61ctSZKBzVeR8oPiachRiJPTlB7ULS98lh3KQXA6Sy0Fy-Sd51Tlad1pE_OcNQAE8l78qKHk9</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2408657618</pqid></control><display><type>article</type><title>A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network</title><source>IEEE Electronic Library (IEL)</source><creator>Shahshahani, Seyed Mohamad Reza ; Mahdiani, Hamid Reza</creator><creatorcontrib>Shahshahani, Seyed Mohamad Reza ; Mahdiani, Hamid Reza</creatorcontrib><description>Eigenvalue Decomposition (EVD) and Singular Value Decomposition (SVD) are two crucial transformations in many signal processing applications. The main drawback of these algorithms is their computationally intensive nature which prevents them to be efficiently exploited in high-performance, real-time and mobile applications. By extracting the inherent parallelism of the Jacobi SVD, a new parallel data distribution and access pattern for this algorithm is proposed first. 
Based on the proposed parallel data distribution, a novel shared-memory architecture is then proposed to support EVD/SVD computation in a high-performance and scalable manner. A new Multistage Interconnection Network based on Batcher's odd-even merge sorting network is developed and exploited in the architecture to preserve its performance and scalability by simultaneously connecting different numbers of processing elements to the system memory hierarchy in a parallel conflict-free manner. The proposed architecture can be configured to compute EVD/SVD of matrices of arbitrary size, with different numbers of processing elements achieving a linear speed-up. The synthesis results in a 90 nm technology show that the system with one, two, and four processing elements achieves a throughput of 1.81, 3.63, and 7.26 million EVD/SVD's per second, respectively with a frequency of 813 MHz for an <inline-formula> <tex-math notation="LaTeX">8\times 8 </tex-math></inline-formula> matrix.</description><identifier>ISSN: 1549-8328</identifier><identifier>EISSN: 1558-0806</identifier><identifier>DOI: 10.1109/TCSI.2020.2973249</identifier><identifier>CODEN: ITCSCH</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Applications programs ; ASIC ; Batcher’s sorting network ; brain-computer interface ; Computer architecture ; Computer memory ; Eigenvalues ; independent component analysis ; Jacobi EVD/SVD ; Jacobian matrices ; Microprocessors ; Mobile computing ; motor imagery ; multi-stage interconnection network ; Parallel processing ; Principal component analysis ; Scalability ; shared-memory architecture ; Signal processing ; Signal processing algorithms ; Singular value decomposition ; Sorting algorithms ; Symmetric matrices</subject><ispartof>IEEE transactions on circuits and systems. I, Regular papers, 2020-06, Vol.67 (6), p.1912-1924</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c293t-4fde5ebab0c3fb472e5006aef3857f53be575a4d98f8e78f48197ea91fa8b0983</citedby><cites>FETCH-LOGICAL-c293t-4fde5ebab0c3fb472e5006aef3857f53be575a4d98f8e78f48197ea91fa8b0983</cites><orcidid>0000-0002-7237-4234 ; 0000-0002-6840-1033</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9007016$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54737</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9007016$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Shahshahani, Seyed Mohamad Reza</creatorcontrib><creatorcontrib>Mahdiani, Hamid Reza</creatorcontrib><title>A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network</title><title>IEEE transactions on circuits and systems. I, Regular papers</title><addtitle>TCSI</addtitle><description>Eigenvalue Decomposition (EVD) and Singular Value Decomposition (SVD) are two crucial transformations in many signal processing applications. The main drawback of these algorithms is their computationally intensive nature which prevents them to be efficiently exploited in high-performance, real-time and mobile applications. By extracting the inherent parallelism of the Jacobi SVD, a new parallel data distribution and access pattern for this algorithm is proposed first. Based on the proposed parallel data distribution, a novel shared-memory architecture is then proposed to support EVD/SVD computation in a high-performance and scalable manner. 
A new Multistage Interconnection Network based on Batcher's odd-even merge sorting network is developed and exploited in the architecture to preserve its performance and scalability by simultaneously connecting different numbers of processing elements to the system memory hierarchy in a parallel conflict-free manner. The proposed architecture can be configured to compute EVD/SVD of matrices of arbitrary size, with different numbers of processing elements achieving a linear speed-up. The synthesis results in a 90 nm technology show that the system with one, two, and four processing elements achieves a throughput of 1.81, 3.63, and 7.26 million EVD/SVD's per second, respectively with a frequency of 813 MHz for an <inline-formula> <tex-math notation="LaTeX">8\times 8 </tex-math></inline-formula> matrix.</description><subject>Applications programs</subject><subject>ASIC</subject><subject>Batcher’s sorting network</subject><subject>brain-computer interface</subject><subject>Computer architecture</subject><subject>Computer memory</subject><subject>Eigenvalues</subject><subject>independent component analysis</subject><subject>Jacobi EVD/SVD</subject><subject>Jacobian matrices</subject><subject>Microprocessors</subject><subject>Mobile computing</subject><subject>motor imagery</subject><subject>multi-stage interconnection network</subject><subject>Parallel processing</subject><subject>Principal component analysis</subject><subject>Scalability</subject><subject>shared-memory architecture</subject><subject>Signal processing</subject><subject>Signal processing algorithms</subject><subject>Singular value decomposition</subject><subject>Sorting algorithms</subject><subject>Symmetric 
matrices</subject><issn>1549-8328</issn><issn>1558-0806</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1LxDAQhoso-PkDxEvAg6euk6Rpk-O6fuMXrHotaTrZVncbnXQRwR_vlhVP88I87ww8SXLIYcQ5mNPnyfRmJEDASJhCisxsJDtcKZ2ChnxzyJlJtRR6O9mN8Q1AGJB8J_kZs-t21qRPSD7QwnYO2dTZua3mq9BYwjq9x0WgbzZ9PWdPFBzGGIiNyTVtj65fErIzG7FmoWO31oWqZeP5LFDbNwtmu3q17V2DdBLZNFDfdjP2gP1XoPf9ZMvbecSDv7mXvFxePE-u07vHq5vJ-C51wsg-zXyNCitbgZO-ygqBCiC36KVWhVeyQlUom9VGe42F9pnmpkBruLe6AqPlXnK8vvtB4XOJsS_fwpK61ctSZKBzVeR8oPiachRiJPTlB7ULS98lh3KQXA6Sy0Fy-Sd51Tlad1pE_OcNQAE8l78qKHk9</recordid><startdate>20200601</startdate><enddate>20200601</enddate><creator>Shahshahani, Seyed Mohamad Reza</creator><creator>Mahdiani, Hamid Reza</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0002-7237-4234</orcidid><orcidid>https://orcid.org/0000-0002-6840-1033</orcidid></search><sort><creationdate>20200601</creationdate><title>A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network</title><author>Shahshahani, Seyed Mohamad Reza ; Mahdiani, Hamid Reza</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c293t-4fde5ebab0c3fb472e5006aef3857f53be575a4d98f8e78f48197ea91fa8b0983</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Applications programs</topic><topic>ASIC</topic><topic>Batcher’s sorting network</topic><topic>brain-computer interface</topic><topic>Computer architecture</topic><topic>Computer memory</topic><topic>Eigenvalues</topic><topic>independent component analysis</topic><topic>Jacobi 
EVD/SVD</topic><topic>Jacobian matrices</topic><topic>Microprocessors</topic><topic>Mobile computing</topic><topic>motor imagery</topic><topic>multi-stage interconnection network</topic><topic>Parallel processing</topic><topic>Principal component analysis</topic><topic>Scalability</topic><topic>shared-memory architecture</topic><topic>Signal processing</topic><topic>Signal processing algorithms</topic><topic>Singular value decomposition</topic><topic>Sorting algorithms</topic><topic>Symmetric matrices</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Shahshahani, Seyed Mohamad Reza</creatorcontrib><creatorcontrib>Mahdiani, Hamid Reza</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on circuits and systems. I, Regular papers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Shahshahani, Seyed Mohamad Reza</au><au>Mahdiani, Hamid Reza</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network</atitle><jtitle>IEEE transactions on circuits and systems. 
I, Regular papers</jtitle><stitle>TCSI</stitle><date>2020-06-01</date><risdate>2020</risdate><volume>67</volume><issue>6</issue><spage>1912</spage><epage>1924</epage><pages>1912-1924</pages><issn>1549-8328</issn><eissn>1558-0806</eissn><coden>ITCSCH</coden><abstract>Eigenvalue Decomposition (EVD) and Singular Value Decomposition (SVD) are two crucial transformations in many signal processing applications. The main drawback of these algorithms is their computationally intensive nature which prevents them to be efficiently exploited in high-performance, real-time and mobile applications. By extracting the inherent parallelism of the Jacobi SVD, a new parallel data distribution and access pattern for this algorithm is proposed first. Based on the proposed parallel data distribution, a novel shared-memory architecture is then proposed to support EVD/SVD computation in a high-performance and scalable manner. A new Multistage Interconnection Network based on Batcher's odd-even merge sorting network is developed and exploited in the architecture to preserve its performance and scalability by simultaneously connecting different numbers of processing elements to the system memory hierarchy in a parallel conflict-free manner. The proposed architecture can be configured to compute EVD/SVD of matrices of arbitrary size, with different numbers of processing elements achieving a linear speed-up. The synthesis results in a 90 nm technology show that the system with one, two, and four processing elements achieves a throughput of 1.81, 3.63, and 7.26 million EVD/SVD's per second, respectively with a frequency of 813 MHz for an <inline-formula> <tex-math notation="LaTeX">8\times 8 </tex-math></inline-formula> matrix.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TCSI.2020.2973249</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-7237-4234</orcidid><orcidid>https://orcid.org/0000-0002-6840-1033</orcidid></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1549-8328 |
ispartof | IEEE transactions on circuits and systems. I, Regular papers, 2020-06, Vol.67 (6), p.1912-1924 |
issn | ISSN: 1549-8328 ; EISSN: 1558-0806 |
language | eng |
recordid | cdi_proquest_journals_2408657618 |
source | IEEE Electronic Library (IEL) |
subjects | Applications programs ; ASIC ; Batcher’s sorting network ; brain-computer interface ; Computer architecture ; Computer memory ; Eigenvalues ; independent component analysis ; Jacobi EVD/SVD ; Jacobian matrices ; Microprocessors ; Mobile computing ; motor imagery ; multi-stage interconnection network ; Parallel processing ; Principal component analysis ; Scalability ; shared-memory architecture ; Signal processing ; Signal processing algorithms ; Singular value decomposition ; Sorting algorithms ; Symmetric matrices |
title | A High-Performance Scalable Shared-Memory SVD Processor Architecture Based on Jacobi Algorithm and Batcher's Sorting Network |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T19%3A07%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20High-Performance%20Scalable%20Shared-Memory%20SVD%20Processor%20Architecture%20Based%20on%20Jacobi%20Algorithm%20and%20Batcher's%20Sorting%20Network&rft.jtitle=IEEE%20transactions%20on%20circuits%20and%20systems.%20I,%20Regular%20papers&rft.au=Shahshahani,%20Seyed%20Mohamad%20Reza&rft.date=2020-06-01&rft.volume=67&rft.issue=6&rft.spage=1912&rft.epage=1924&rft.pages=1912-1924&rft.issn=1549-8328&rft.eissn=1558-0806&rft.coden=ITCSCH&rft_id=info:doi/10.1109/TCSI.2020.2973249&rft_dat=%3Cproquest_RIE%3E2408657618%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2408657618&rft_id=info:pmid/&rft_ieee_id=9007016&rfr_iscdi=true |
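
The description above refers to the Jacobi algorithm for EVD/SVD. As a rough software illustration only, the following is a minimal sketch of the classical cyclic Jacobi eigenvalue iteration for a real symmetric matrix. It is not the hardware architecture or the data distribution proposed in the article, which this record does not detail; the function name `jacobi_evd` and its parameters `sweeps` and `tol` are hypothetical choices made for this example.

```python
import math


def jacobi_evd(a, sweeps=30, tol=1e-12):
    """Classical cyclic Jacobi iteration for a real symmetric matrix.

    Illustrative sketch only (hypothetical helper, not the article's design).
    Returns (eigenvalues, v); column k of v is the eigenvector that
    belongs to eigenvalues[k].
    """
    n = len(a)
    a = [row[:] for row in a]  # work on a copy of the input
    v = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(sweeps):
        # Frobenius norm of the off-diagonal part measures convergence.
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle that zeroes a[p][q]: tan(2*theta) = 2*a_pq / (a_qq - a_pp)
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                # A <- A * J: update columns p and q.
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                # A <- J^T * A: update rows p and q.
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                # V <- V * J: accumulate the eigenvectors.
                for k in range(n):
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p] = c * vkp - s * vkq
                    v[k][q] = s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


if __name__ == "__main__":
    values, vectors = jacobi_evd([[2.0, 1.0], [1.0, 2.0]])
    print(sorted(values))
```

Each rotation touches only rows and columns p and q, which is why parallel Jacobi schemes in general can schedule rotations on disjoint index pairs concurrently. How the article maps such pairs onto its processing elements and memory hierarchy is not stated in this record.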