Towards Automatic Transformation of Legacy Scientific Code into OpenCL for Optimal Performance on FPGAs
Main Authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | There is a large body of legacy scientific code written in languages like
Fortran that is not optimised to get the best performance out of heterogeneous
acceleration devices like GPUs and FPGAs, and manually porting such code into
parallel language frameworks like OpenCL requires considerable effort. We are
working towards developing a turn-key, self-optimising compiler for
accelerating scientific applications that can automatically transform legacy
code into a solution for heterogeneous targets. In this paper we focus on FPGAs
as the acceleration devices, and carry out our discussion in the context of the
OpenCL programming framework. We show a route to automatic creation of kernels
which are optimised for execution in a "streaming" fashion, which gives optimal
performance on FPGAs. We use a 2D shallow-water model as an illustration;
specifically we show how the use of channels to communicate directly
between peer kernels and the use of on-chip memory to create stencil buffers
can lead to significant performance improvements. Our results show better FPGA
performance against a baseline CPU implementation, and better energy-efficiency
against both CPU and GPU implementations. |
---|---|
DOI: | 10.48550/arxiv.1901.00416 |
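
The stencil buffer mentioned in the summary can be illustrated with a minimal sketch in plain C, used here as a stand-in for an OpenCL kernel. It assumes a small row-major grid streamed one element at a time through a shift register of length 2 * width + 1, which is the smallest window that holds all five neighbours of a 5-point stencil. The names `stencil_stream`, `GRID_W`, `GRID_H` and `BUF_LEN`, and the simple neighbour sum, are hypothetical choices for illustration; they are not the paper's shallow-water update or its generated code.

```c
#include <stddef.h>

#define GRID_W 4                     /* hypothetical grid width  */
#define GRID_H 4                     /* hypothetical grid height */
#define BUF_LEN (2 * GRID_W + 1)     /* window covering north..south neighbours */

/* Streams `in` (row-major) through a shift register and writes a 5-point
 * neighbour sum for interior cells; boundary cells are copied unchanged.
 * After pushing element k, buf[j] holds element k - j. */
void stencil_stream(const float *in, float *out)
{
    const size_t n = (size_t)GRID_W * GRID_H;
    float buf[BUF_LEN] = {0};

    /* Run GRID_W extra iterations so the last row drains out of the buffer. */
    for (size_t k = 0; k < n + GRID_W; ++k) {
        /* Shift the window by one element and insert the newest input. */
        for (size_t j = BUF_LEN - 1; j > 0; --j)
            buf[j] = buf[j - 1];
        buf[0] = (k < n) ? in[k] : 0.0f;

        if (k < GRID_W)
            continue;                /* centre element not yet in the window */

        size_t c = k - GRID_W;       /* linear index of the centre element */
        size_t row = c / GRID_W;
        size_t col = c % GRID_W;
        int interior = row > 0 && row < GRID_H - 1 &&
                       col > 0 && col < GRID_W - 1;

        out[c] = interior
            ? buf[2 * GRID_W]        /* north  (c - GRID_W) */
              + buf[GRID_W + 1]      /* west   (c - 1)      */
              + buf[GRID_W]          /* centre (c)          */
              + buf[GRID_W - 1]      /* east   (c + 1)      */
              + buf[0]               /* south  (c + GRID_W) */
            : buf[GRID_W];
    }
}
```

Each input element is read from `in` exactly once, and every neighbour access hits the small local array. That array is the kind of structure an FPGA toolchain can typically keep in on-chip memory, which is the motivation the summary gives for stencil buffers.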