CUDA and OpenCL implementations of 3D Fast Wavelet Transform
Format: | Conference proceeding |
---|---|
Language: | English |
Summary: | This paper presents several implementations of the 3D Fast Wavelet Transform (3D-FWT) in CUDA and OpenCL running on a new Fermi Tesla architecture. We evaluate these proposals and compare them with other optimized implementations executed on multicore CPUs and on an Nvidia Tesla C870. The CUDA version on the Fermi architecture gives the best results, with speedups over the CPU execution times ranging from 5.3× to 7.4× for different image sizes, and up to 81× when communications are neglected. Meanwhile, OpenCL obtains solid gains, ranging from 2× on small frame sizes to 3× on larger ones. |
---|---|
DOI: | 10.1109/LASCAS.2012.6180318 |