Effective GPU Parallelization of Distributed and Localized Model Predictive Control
Format: Article
Language: English
Abstract: To effectively control large-scale distributed systems online, model
predictive control (MPC) has to swiftly solve the underlying high-dimensional
optimization. The literature offers multiple techniques to accelerate the solving
process, mainly attributed to software-based algorithmic advancements and
hardware-assisted computation enhancements. However, those methods focus on
arithmetic accelerations and overlook the benefits of the underlying system's
structure. In particular, the existing decoupled software-hardware algorithm
design, which naively parallelizes the arithmetic operations on the hardware,
does not tackle hardware overheads such as CPU-GPU and thread-to-thread
communication in a principled manner. Moreover, the advantages of parallelizable
subproblem decomposition in distributed MPC are not well recognized or exploited.
As a result, the full potential of hardware acceleration for MPC has not been
reached. In this paper, we explore those opportunities by leveraging the GPU to
parallelize the distributed and localized MPC (DLMPC) algorithm. We exploit the
locality constraints embedded in the DLMPC formulation to reduce the
hardware-intrinsic communication overheads. Our parallel implementation achieves
up to 50x faster runtime than its CPU counterparts under various parameters.
Furthermore, we find that locality-aware GPU parallelization can halve the
optimization runtime compared to the naive acceleration. Overall, our results
demonstrate the performance gains brought by software-hardware co-design with the
information exchange structure in mind.
DOI: 10.48550/arxiv.2103.14990
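The record above only summarizes the approach, so the following is a loose illustration of the decomposition idea it describes: one GPU block per subsystem, with each subproblem reading only state within a fixed locality radius, launched in a single kernel so no per-iteration CPU-GPU transfer is needed. This is not the paper's DLMPC implementation; the subsystem count, horizon, locality radius, cost terms, and the projected-gradient inner loop are all made-up placeholders.

```cuda
// Hypothetical sketch: parallel per-subsystem subproblems with locality-limited
// reads. Not the DLMPC algorithm itself; all constants and costs are illustrative.
#include <cstdio>
#include <cuda_runtime.h>

#define N_SUB   64   // number of subsystems (assumed)
#define HORIZON 8    // prediction horizon steps per subproblem (assumed)
#define D       1    // locality radius: each subproblem sees +/- D neighbors

// Each block handles one subsystem; each thread owns one horizon step. A few
// projected gradient iterations on 0.5*h*u^2 + q*u stand in for the local QP solve.
__global__ void solve_local_subproblems(const float* x, float* u)
{
    int sub = blockIdx.x;            // subsystem index
    int t   = threadIdx.x;           // horizon step handled by this thread

    // Build the local linear cost term from this subsystem's D-hop neighborhood only.
    float q = 0.0f;
    for (int j = max(0, sub - D); j <= min(N_SUB - 1, sub + D); ++j)
        q += 0.1f * (t + 1) * x[j];

    float h  = 2.0f;                 // diagonal Hessian entry (assumed)
    float ut = 0.0f;
    for (int it = 0; it < 20; ++it) {
        ut -= 0.1f * (h * ut + q);           // gradient step
        ut  = fminf(1.0f, fmaxf(-1.0f, ut)); // project onto box constraint
    }
    u[sub * HORIZON + t] = ut;
}

int main()
{
    float *x, *u;
    cudaMallocManaged(&x, N_SUB * sizeof(float));
    cudaMallocManaged(&u, N_SUB * HORIZON * sizeof(float));
    for (int i = 0; i < N_SUB; ++i) x[i] = 0.01f * i;

    // One block per subsystem, one thread per horizon step: every local
    // subproblem is solved in a single kernel launch.
    solve_local_subproblems<<<N_SUB, HORIZON>>>(x, u);
    cudaDeviceSynchronize();
    printf("u[0] = %f\n", u[0]);

    cudaFree(x);
    cudaFree(u);
    return 0;
}
```

Because each block touches only its own neighborhood of the state vector and all subproblems run from one launch, per-iteration CPU-GPU transfers and cross-block communication are avoided, which is the kind of overhead reduction the abstract attributes to locality-aware parallelization.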