Scalable High-Quality Hypergraph Partitioning
Saved in:
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Balanced hypergraph partitioning is an NP-hard problem with many
applications, e.g., optimizing communication in distributed data placement
problems. The goal is to place all nodes across $k$ different blocks of bounded
size, such that hyperedges span as few parts as possible. This problem is
well-studied in the sequential and distributed settings, but not in shared
memory. We close this gap by devising efficient and scalable shared-memory
algorithms for all components employed in the best sequential solvers, without
compromising solution quality.
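The objective stated above (hyperedges spanning as few blocks as possible, subject to a block-size bound) corresponds to the standard connectivity metric, $\sum_e (\lambda(e) - 1)$. The following sketch is illustrative only; the hypergraph representation and function names are assumptions, not Mt-KaHyPar's API:

```python
from collections import defaultdict
from math import ceil

def connectivity_objective(hyperedges, partition):
    """Sum of (lambda(e) - 1) over all hyperedges, where lambda(e)
    is the number of distinct blocks spanned by hyperedge e."""
    total = 0
    for edge in hyperedges:
        blocks = {partition[v] for v in edge}
        total += len(blocks) - 1
    return total

def is_balanced(partition, k, epsilon, n):
    """Each block may hold at most (1 + epsilon) * ceil(n / k) nodes."""
    sizes = defaultdict(int)
    for v, b in partition.items():
        sizes[b] += 1
    limit = (1 + epsilon) * ceil(n / k)
    return all(s <= limit for s in sizes.values())

# Toy hypergraph: 4 nodes, 2 hyperedges, k = 2 blocks.
edges = [(0, 1, 2), (2, 3)]
part = {0: 0, 1: 0, 2: 0, 3: 1}
print(connectivity_objective(edges, part))  # 1: only edge (2, 3) spans both blocks
print(is_balanced(part, k=2, epsilon=0.5, n=4))
```

A partitioner searches for an assignment that minimizes the first quantity while keeping the second constraint satisfied.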
This work presents the scalable and high-quality hypergraph partitioning
framework Mt-KaHyPar. Its most important components are parallel improvement
algorithms based on the FM algorithm and maximum flows, as well as a parallel
clustering algorithm for coarsening, which are used in a multilevel scheme
with $\log(n)$ levels. As additional components, we parallelize the $n$-level
partitioning scheme, devise a deterministic version of our algorithm, and
present optimizations for plain graphs.
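The multilevel scheme mentioned above follows the classic coarsen / initial-partition / uncoarsen-and-refine pattern. A highly simplified, runnable toy sketch (pair contraction and round-robin assignment are stand-ins for the actual clustering, initial partitioning, and FM/flow refinement algorithms):

```python
def multilevel_partition(n, edges, k, contraction_limit=4):
    """Toy multilevel scheme on a hypergraph with nodes 0..n-1:
    coarsen by merging node pairs until the instance is small, split
    the coarsest node set into k blocks, then project back level by
    level (refinement would run at each projection step)."""
    if n <= contraction_limit * k:
        # Initial partition: round-robin assignment of the coarsest nodes.
        return {v: v % k for v in range(n)}
    # Coarsening: contract consecutive node pairs (stand-in for clustering).
    mapping = {v: v // 2 for v in range(n)}
    coarse_edges = [tuple(sorted({mapping[v] for v in e})) for e in edges]
    coarse_part = multilevel_partition((n + 1) // 2, coarse_edges, k,
                                       contraction_limit)
    # Uncoarsening: project the coarse block labels to the finer level.
    part = {v: coarse_part[mapping[v]] for v in range(n)}
    # A refinement step (e.g. FM moves) would improve `part` here.
    return part

part = multilevel_partition(16, [(0, 1, 2), (5, 9, 13)], k=2)
print(sorted(set(part.values())))  # [0, 1]
```

Because each coarsening step roughly halves the node count, this yields the $\log(n)$ levels referred to in the summary; the $n$-level variant instead contracts a single pair per level.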
We evaluate our solver on more than 800 graphs and hypergraphs, and compare
it with 25 different algorithms from the literature. Our fastest configuration
outperforms almost all existing hypergraph partitioners with regard to both
solution quality and running time. Our highest-quality configuration achieves
the same solution quality as the best sequential partitioner, KaHyPar, while
being an order of magnitude faster with ten threads. Thus, two of our
configurations occupy all fronts of the Pareto curve for hypergraph
partitioning. Furthermore, our solvers exhibit good speedups in the geometric
mean on 64 cores, e.g., 29.6x (deterministic), 22.3x ($\log(n)$-level), and
25.9x ($n$-level). |
DOI: | 10.48550/arxiv.2303.17679 |