thSORT: an efficient parallel sorting algorithm on multi-core DSPs


Bibliographic details
Published in: CCF transactions on high performance computing (Online) 2024-10, Vol.6 (5), p.503-518
Main authors: Yang, Mouzhi; Zhang, Peng; Fang, Jianbin; Liu, Weifeng; Huang, Chun
Format: Article
Language: English
Online access: Full text
Description
Summary: Multi-core architecture has become the main trend in high performance computing (HPC) because of its powerful parallel computing capability. Due to energy efficiency constraints, energy-efficient multi-core digital signal processors (DSPs) have become an alternative architecture in HPC systems. FT-M7032 is a CPU-DSP heterogeneous processor that integrates 16 CPU cores for running operating systems and four multi-core general purpose DSP (GPDSP) clusters for providing high performance. Sorting is a fundamental operation in computer science with numerous applications and has been studied extensively, but high-performance parallel sorting algorithms are typically architecture-specific. To our knowledge, little attention has been paid to optimizing sorting on low-power multi-core DSPs. In this paper, we propose thSORT, an efficient bitonic sorting algorithm for FT-M7032. Our algorithm consists of two parts, single-core DSP sorting and multi-core DSP sorting, both designed to exploit the features of FT-M7032. We implement a vector micro-kernel for bitonic sort and propose a multi-level algorithm to merge the results of the micro-kernel. Compared to the CPU baselines, our implementation is 1.43× faster than the parallel sorting of the Boost C++ Libraries and 2.15× faster than std::sort.
ISSN: 2524-4922, 2524-4930
DOI: 10.1007/s42514-023-00175-7
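
The summary describes a bitonic-sort micro-kernel whose sorted outputs are then combined by a multi-level merge. The sketch below illustrates that general two-phase idea in portable scalar C++. It is not the authors' implementation: it uses no FT-M7032 vector instructions, the function names `bitonic_sort` and `sort_blocks_then_merge` are hypothetical, and `std::inplace_merge` stands in for whatever merge strategy the paper actually uses. It assumes the block size is a power of two and divides the input length.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts data[0..n) in ascending order with an iterative bitonic network.
// Assumption: n is a power of two (n == 0 or 1 is a no-op).
void bitonic_sort(int* data, std::size_t n) {
    // k is the length of the bitonic sequences merged in this stage.
    for (std::size_t k = 2; k <= n; k <<= 1) {
        // j is the distance between the two elements of a compare-exchange.
        for (std::size_t j = k >> 1; j > 0; j >>= 1) {
            for (std::size_t i = 0; i < n; ++i) {
                const std::size_t partner = i ^ j;
                if (partner <= i) continue;  // handle each pair once
                // Direction alternates between adjacent k-sized groups.
                const bool ascending = (i & k) == 0;
                const bool out_of_order = ascending
                    ? data[i] > data[partner]
                    : data[i] < data[partner];
                if (out_of_order) std::swap(data[i], data[partner]);
            }
        }
    }
}

// Hypothetical two-phase driver: sort fixed-size blocks with the network,
// then merge sorted runs pairwise, doubling the run length at each level.
// Assumption: block is a nonzero power of two that divides v.size().
void sort_blocks_then_merge(std::vector<int>& v, std::size_t block) {
    const std::size_t n = v.size();
    int* p = v.data();
    // Phase 1: each block is sorted independently (the "micro-kernel" role).
    for (std::size_t start = 0; start < n; start += block) {
        bitonic_sort(p + start, block);
    }
    // Phase 2: merge neighbouring sorted runs level by level.
    for (std::size_t run = block; run < n; run *= 2) {
        for (std::size_t lo = 0; lo + run < n; lo += 2 * run) {
            const std::size_t mid = lo + run;
            const std::size_t hi = std::min(lo + 2 * run, n);
            std::inplace_merge(p + lo, p + mid, p + hi);
        }
    }
}
```

In this sketch the blocks in phase 1 are independent of one another, which is the property a multi-core version would rely on to spread the work across cores. How thSORT actually partitions and merges data on the DSP clusters is not stated in the summary.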