An efficient parallel deblocking filter based on GPU: Implementation and optimization
Main authors: | , , , |
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | The deblocking filter is one of the most time-consuming tasks in the H.264/AVC standard. Because of its data dependencies and frequent memory accesses, mapping the algorithm efficiently onto a massively parallel architecture is difficult. This paper proposes a novel GPU-based parallel deblocking filter that weakens the dependencies between macroblocks (MBs) by rearranging the order in which boundaries are filtered. We implemented the proposed algorithm on a GPU and optimized the program with three strategies: kernel combination, reuse of intermediate data, and an optimized data representation. Experimental results show that the proposed parallel method supports a real-time processing throughput of 450 fps for 1080p video. We also observed speedups of 3.78× and 16.68× for the fully optimized parallel deblocking filter relative to a two-core processor implementation and the state-of-the-art GPU-based implementation, respectively. |
---|---|
ISSN: | 1555-5798 2154-5952 |
DOI: | 10.1109/PACRIM.2011.6032906 |
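
The dependency problem named in the summary can be made concrete with a small sketch. It does not reproduce the paper's reordered boundary filtering, which this record does not describe. It shows only the conventional wavefront baseline, in which a macroblock at column x and row y is filtered after its left and top neighbours, so all macroblocks with the same x + y can run in parallel. The function names (`mb_grid`, `wavefronts`) are hypothetical, and the 1080p frame is assumed to be 1920x1080 luma samples padded to whole 16x16 macroblocks. The 450 fps figure comes from the summary; the other numbers are derived from it.

```python
# Illustrative baseline only: conventional wavefront scheduling of macroblocks,
# NOT the reordered boundary filtering proposed in the paper.

MB_SIZE = 16  # an H.264 macroblock covers 16x16 luma samples


def mb_grid(width, height):
    """Return (columns, rows) of macroblocks, rounding up to whole macroblocks."""
    cols = (width + MB_SIZE - 1) // MB_SIZE
    rows = (height + MB_SIZE - 1) // MB_SIZE
    return cols, rows


def wavefronts(cols, rows):
    """Group macroblocks so that each one's left and top neighbours
    always belong to an earlier group (group index is x + y)."""
    waves = [[] for _ in range(cols + rows - 1)]
    for y in range(rows):
        for x in range(cols):
            waves[x + y].append((x, y))
    return waves


cols, rows = mb_grid(1920, 1080)          # assumed 1080p luma resolution
waves = wavefronts(cols, rows)

total_mbs = cols * rows                    # macroblocks per frame
sequential_steps = len(waves)              # dependent steps per frame
widest_wave = max(len(w) for w in waves)   # peak parallelism in macroblocks

fps = 450                                  # throughput reported in the summary
mbs_per_second = total_mbs * fps
frame_budget_ms = 1000.0 / fps

print(f"{cols}x{rows} macroblocks, {total_mbs} per frame")
print(f"{sequential_steps} wavefront steps, at most {widest_wave} macroblocks in parallel")
print(f"{mbs_per_second} macroblocks/s, {frame_budget_ms:.2f} ms per frame")
```

Under these assumptions the baseline never exposes more than one macroblock per row at a time, which is far below what a GPU can run concurrently. This gives one plausible reading of why the summary emphasizes weakening the dependencies between MBs.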