Designing and building application‐centric parallel memories

Summary Memory bandwidth is a critical performance factor for many applications and architectures. Intuitively, a parallel memory could be a good solution for any bandwidth‐limited application, yet building application‐centric custom parallel memories remains a challenge. In this work, we present a...


Bibliographic details
Published in:	Concurrency and computation 2020-08, Vol.32 (15), p.n/a
Main authors:	Stramondo, Giulio; Ciobanu, Cătălin Bogdan; Laat, Cees; Varbanescu, Ana Lucia
Format:	Article
Language:	English
Subjects:	FPGA; high‐bandwidth parallel memories; memory access patterns; Performance prediction; Prediction models; STREAM benchmark
Online access:	Full text

container_end_page	n/a
container_issue	15
container_start_page
container_title	Concurrency and computation
container_volume	32
creator	Stramondo, Giulio; Ciobanu, Cătălin Bogdan; Laat, Cees; Varbanescu, Ana Lucia
description	Summary Memory bandwidth is a critical performance factor for many applications and architectures. Intuitively, a parallel memory could be a good solution for any bandwidth‐limited application, yet building application‐centric custom parallel memories remains a challenge. In this work, we present a comprehensive approach to tackle this challenge and demonstrate how to systematically design and implement application‐centric parallel memories. Specifically, our approach (1) analyzes the application memory access traces to extract parallel accesses, (2) configures our parallel memory for maximum performance, and (3) builds the actual application‐centric memory system. We further provide a simple performance prediction model for the constructed memory system. We evaluate our approach with two sets of experiments. First, we demonstrate how our parallel memories provide performance benefits for a broad range of memory access patterns. Second, we prove the feasibility of our approach and validate our performance model by implementing and benchmarking the designed parallel memories using FPGA hardware and a sparse version of the STREAM benchmark.
doi_str_mv	10.1002/cpe.5485
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2420666792</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2420666792</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3275-73348953274d794a273651afb688024bce35f2d14cb499d140d1849714c350243</originalsourceid><addsrcrecordid>eNp1kMtKxDAUhoMoOI6Cj1Bw46Zj7mk3gtTxAgO60HVI03TIkF5MWmR2PsI8o09iZiruXP3nHD7OOXwAXCK4QBDiG92bBaMZOwIzxAhOISf0-K_G_BSchbCBECFI0Azc3ptg161t14lqq6QcrasOTd87q9Vgu_b7a6dNO3irk1555ZxxSWOazlsTzsFJrVwwF785B-8Py7fiKV29PD4Xd6tUEyxYKgihWR5fELQSOVVYEM6QqkueZRDTUhvCalwhqkua5zFhhTKaizggLAJkDq6mvb3vPkYTBrnpRt_GkxJTDDnnIseRup4o7bsQvKll722j_FYiKPd2ZLQj93Yimk7op3Vm-y8ni9flgf8BQGNkfg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2420666792</pqid></control><display><type>article</type><title>Designing and building application‐centric parallel memories</title><source>Access via Wiley Online Library</source><creator>Stramondo, Giulio ; Ciobanu, Cătălin Bogdan ; Laat, Cees ; Varbanescu, Ana Lucia</creator><creatorcontrib>Stramondo, Giulio ; Ciobanu, Cătălin Bogdan ; Laat, Cees ; Varbanescu, Ana Lucia</creatorcontrib><description>Summary Memory bandwidth is a critical performance factor for many applications and architectures. Intuitively, a parallel memory could be a good solution for any bandwidth‐limited application, yet building application‐centric custom parallel memories remains a challenge. In this work, we present a comprehensive approach to tackle this challenge and demonstrate how to systematically design and implement application‐centric parallel memories. Specifically, our approach (1) analyzes the application memory access traces to extract parallel accesses, (2) configures our parallel memory for maximum performance, and (3) builds the actual application‐centric memory system. 
We further provide a simple performance prediction model for the constructed memory system. We evaluate our approach with two sets of experiments. First, we demonstrate how our parallel memories provide performance benefits for a broad range of memory access patterns. Second, we prove the feasibility of our approach and validate our performance model by implementing and benchmarking the designed parallel memories using FPGA hardware and a sparse version of the STREAM benchmark.</description><identifier>ISSN: 1532-0626</identifier><identifier>EISSN: 1532-0634</identifier><identifier>DOI: 10.1002/cpe.5485</identifier><language>eng</language><publisher>Hoboken: Wiley Subscription Services, Inc</publisher><subject>FPGA ; high‐bandwidth parallel memories ; memory access patterns ; Performance prediction ; Prediction models ; STREAM benchmark</subject><ispartof>Concurrency and computation, 2020-08, Vol.32 (15), p.n/a</ispartof><rights>2019 The Authors. Published by John Wiley & Sons Ltd.</rights><rights>2019. This article is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c3275-73348953274d794a273651afb688024bce35f2d14cb499d140d1849714c350243</citedby><cites>FETCH-LOGICAL-c3275-73348953274d794a273651afb688024bce35f2d14cb499d140d1849714c350243</cites><orcidid>0000-0002-3124-189X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1002%2Fcpe.5485$$EPDF$$P50$$Gwiley$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1002%2Fcpe.5485$$EHTML$$P50$$Gwiley$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,1417,27924,27925,45574,45575</link.rule.ids></links><search><creatorcontrib>Stramondo, Giulio</creatorcontrib><creatorcontrib>Ciobanu, Cătălin Bogdan</creatorcontrib><creatorcontrib>Laat, Cees</creatorcontrib><creatorcontrib>Varbanescu, Ana Lucia</creatorcontrib><title>Designing and building application‐centric parallel memories</title><title>Concurrency and computation</title><description>Summary Memory bandwidth is a critical performance factor for many applications and architectures. Intuitively, a parallel memory could be a good solution for any bandwidth‐limited application, yet building application‐centric custom parallel memories remains a challenge. In this work, we present a comprehensive approach to tackle this challenge and demonstrate how to systematically design and implement application‐centric parallel memories. Specifically, our approach (1) analyzes the application memory access traces to extract parallel accesses, (2) configures our parallel memory for maximum performance, and (3) builds the actual application‐centric memory system. 
We further provide a simple performance prediction model for the constructed memory system. We evaluate our approach with two sets of experiments. First, we demonstrate how our parallel memories provide performance benefits for a broad range of memory access patterns. Second, we prove the feasibility of our approach and validate our performance model by implementing and benchmarking the designed parallel memories using FPGA hardware and a sparse version of the STREAM benchmark.</description><subject>FPGA</subject><subject>high‐bandwidth parallel memories</subject><subject>memory access patterns</subject><subject>Performance prediction</subject><subject>Prediction models</subject><subject>STREAM benchmark</subject><issn>1532-0626</issn><issn>1532-0634</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>24P</sourceid><sourceid>WIN</sourceid><recordid>eNp1kMtKxDAUhoMoOI6Cj1Bw46Zj7mk3gtTxAgO60HVI03TIkF5MWmR2PsI8o09iZiruXP3nHD7OOXwAXCK4QBDiG92bBaMZOwIzxAhOISf0-K_G_BSchbCBECFI0Azc3ptg161t14lqq6QcrasOTd87q9Vgu_b7a6dNO3irk1555ZxxSWOazlsTzsFJrVwwF785B-8Py7fiKV29PD4Xd6tUEyxYKgihWR5fELQSOVVYEM6QqkueZRDTUhvCalwhqkua5zFhhTKaizggLAJkDq6mvb3vPkYTBrnpRt_GkxJTDDnnIseRup4o7bsQvKll722j_FYiKPd2ZLQj93Yimk7op3Vm-y8ni9flgf8BQGNkfg</recordid><startdate>20200810</startdate><enddate>20200810</enddate><creator>Stramondo, Giulio</creator><creator>Ciobanu, Cătălin Bogdan</creator><creator>Laat, Cees</creator><creator>Varbanescu, Ana Lucia</creator><general>Wiley Subscription Services, Inc</general><scope>24P</scope><scope>WIN</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-3124-189X</orcidid></search><sort><creationdate>20200810</creationdate><title>Designing and building application‐centric parallel memories</title><author>Stramondo, Giulio ; Ciobanu, Cătălin Bogdan ; 
Laat, Cees ; Varbanescu, Ana Lucia</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3275-73348953274d794a273651afb688024bce35f2d14cb499d140d1849714c350243</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>FPGA</topic><topic>high‐bandwidth parallel memories</topic><topic>memory access patterns</topic><topic>Performance prediction</topic><topic>Prediction models</topic><topic>STREAM benchmark</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Stramondo, Giulio</creatorcontrib><creatorcontrib>Ciobanu, Cătălin Bogdan</creatorcontrib><creatorcontrib>Laat, Cees</creatorcontrib><creatorcontrib>Varbanescu, Ana Lucia</creatorcontrib><collection>Wiley Online Library (Open Access Collection)</collection><collection>Wiley Online Library (Open Access Collection)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Concurrency and computation</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Stramondo, Giulio</au><au>Ciobanu, Cătălin Bogdan</au><au>Laat, Cees</au><au>Varbanescu, Ana Lucia</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Designing and building application‐centric parallel memories</atitle><jtitle>Concurrency and computation</jtitle><date>2020-08-10</date><risdate>2020</risdate><volume>32</volume><issue>15</issue><epage>n/a</epage><issn>1532-0626</issn><eissn>1532-0634</eissn><abstract>Summary Memory 
bandwidth is a critical performance factor for many applications and architectures. Intuitively, a parallel memory could be a good solution for any bandwidth‐limited application, yet building application‐centric custom parallel memories remains a challenge. In this work, we present a comprehensive approach to tackle this challenge and demonstrate how to systematically design and implement application‐centric parallel memories. Specifically, our approach (1) analyzes the application memory access traces to extract parallel accesses, (2) configures our parallel memory for maximum performance, and (3) builds the actual application‐centric memory system. We further provide a simple performance prediction model for the constructed memory system. We evaluate our approach with two sets of experiments. First, we demonstrate how our parallel memories provide performance benefits for a broad range of memory access patterns. Second, we prove the feasibility of our approach and validate our performance model by implementing and benchmarking the designed parallel memories using FPGA hardware and a sparse version of the STREAM benchmark.</abstract><cop>Hoboken</cop><pub>Wiley Subscription Services, Inc</pub><doi>10.1002/cpe.5485</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-3124-189X</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1532-0626
ispartof	Concurrency and computation, 2020-08, Vol.32 (15), p.n/a
issn	1532-0626; 1532-0634
language	eng
recordid	cdi_proquest_journals_2420666792
source	Access via Wiley Online Library
subjects	FPGA; high‐bandwidth parallel memories; memory access patterns; Performance prediction; Prediction models; STREAM benchmark
title	Designing and building application‐centric parallel memories
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-30T23%3A41%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Designing%20and%20building%20application%E2%80%90centric%20parallel%20memories&rft.jtitle=Concurrency%20and%20computation&rft.au=Stramondo,%20Giulio&rft.date=2020-08-10&rft.volume=32&rft.issue=15&rft.epage=n/a&rft.issn=1532-0626&rft.eissn=1532-0634&rft_id=info:doi/10.1002/cpe.5485&rft_dat=%3Cproquest_cross%3E2420666792%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2420666792&rft_id=info:pmid/&rfr_iscdi=true