Exploring memory synchronization and performance considerations for FPGA platform using the high-abstracted OpenCL framework: Benchmarks development and analysis

Detailed description

Bibliographic details
Published in:	PloS one 2024-05, Vol.19 (5), p.e0301720-e0301720
Main authors:	Almomany, Abedalmuhdi; Jarrah, Amin; Sutcu, Muhammed
Format:	Article
Language:	English (eng)
Subjects:	Acceleration; Access time; Benchmarking; Benchmarks; Biology and Life Sciences; Commodities; Computation; Computer and Information Sciences; Computer architecture; Computer memory; Design; Digital integrated circuits; Digital signal processors; Embedded systems; Engineering and Technology; Field programmable gate arrays; Graphics processing units; Humans; Mathematical functions; Optimization techniques; Performance assessment; Physical Sciences; Platforms; Programming Languages; Research and Analysis Methods; Social Sciences; Software; Synchronism; Synchronization; Systems analysis

Description
Abstract:	A key benefit of the Open Computing Language (OpenCL) software framework is its capability to operate across diverse architectures. Field programmable gate arrays (FPGAs) are a high-speed computing architecture used for computation acceleration. This study investigates the impact of memory access time on overall performance in general FPGA computing environments through the creation of eight benchmarks within the OpenCL framework. The developed benchmarks capture a range of memory access behaviors, and they play a crucial role in assessing the performance of spinning and sleeping on FPGA-based architectures. The results obtained guide the formulation of new implementations and contribute to defining an abstraction of FPGAs. This abstraction is then utilized to create tailored implementations of primitives that are well-suited for this platform. While other research endeavors concentrate on creating benchmarks with the Compute Unified Device Architecture (CUDA) to scrutinize the memory systems across diverse GPU architectures and propose recommendations for future generations of GPU computation platforms, this study delves into the memory system analysis for the broader FPGA computing platform. It achieves this by employing the highly abstracted OpenCL framework, exploring various data workload characteristics, and experimentally delineating the appropriate implementation of primitives that can seamlessly integrate into a design tailored for the FPGA computing platform. Additionally, the results underscore the efficacy of employing a task-parallel model to mitigate the need for high-cost synchronization mechanisms in designs constructed on general FPGA computing platforms.
ISSN:	1932-6203
DOI:	10.1371/journal.pone.0301720