Cheetah: Optimizing and Accelerating Homomorphic Encryption for Private Inference
Saved in:
Main authors: , , , , , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: As the application of deep learning continues to grow, so does the amount of data used to make predictions. While big-data deep learning was traditionally constrained by computing performance and off-chip memory bandwidth, a new constraint has emerged: privacy. One solution is homomorphic encryption (HE). Applying HE to the client-cloud model allows cloud services to perform inference directly on the client's encrypted data. While HE can meet privacy constraints, it introduces enormous computational challenges and remains impractically slow in current systems.

This paper introduces Cheetah, a set of algorithmic and hardware optimizations for HE DNN inference to achieve plaintext DNN inference speeds. Cheetah proposes HE-parameter tuning and operator scheduling optimizations, which together deliver a 79x speedup over the state of the art. However, this still falls short of plaintext inference speeds by almost four orders of magnitude. To bridge the remaining performance gap, Cheetah further proposes an accelerator architecture that, when combined with the algorithmic optimizations, approaches plaintext DNN inference speeds. We evaluate several common neural network models (e.g., ResNet50, VGG16, and AlexNet) and show that plaintext-level HE inference for each is feasible with a custom accelerator consuming 30 W and 545 mm^2.
DOI: 10.48550/arxiv.2006.00505
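To make the client-cloud HE model from the abstract concrete, here is a minimal sketch of encrypted linear-layer inference using the open-source TenSEAL library (CKKS scheme). TenSEAL is not part of Cheetah, and the parameter values below are illustrative assumptions, not the paper's settings; they are, however, exactly the kind of knobs (polynomial degree, coefficient-modulus sizes, scale) that the paper's HE-parameter tuning searches over.

```python
# Sketch of client-cloud HE inference: the server computes W.x + b on an
# encrypted input without ever seeing x in the clear. Uses TenSEAL (CKKS);
# all parameters here are illustrative assumptions, not Cheetah's.
import tenseal as ts
import numpy as np

# -- Client side: choose HE parameters and encrypt the input ----------------
# Larger poly_modulus_degree / more modulus primes give a bigger noise budget
# (deeper circuits) at higher compute cost -- the trade-off Cheetah tunes.
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2 ** 40
context.generate_galois_keys()  # rotations, needed for dot products

x = np.random.randn(16)               # client's private input vector
enc_x = ts.ckks_vector(context, x.tolist())

# -- Server side: linear layer evaluated directly on the ciphertext ---------
W = np.random.randn(4, 16)            # plaintext model weights
b = np.random.randn(4)
enc_y = enc_x.mm(W.T.tolist())        # ciphertext-plaintext vector-matrix product
enc_y += b.tolist()

# -- Client side: decrypt the prediction -------------------------------------
y = np.array(enc_y.decrypt())
print("HE result:   ", y)
print("Plain result:", W @ x + b)     # agrees up to CKKS approximation error
```

Even this single encrypted layer is orders of magnitude slower than its plaintext counterpart, which is the gap the paper's 79x algorithmic speedup and custom accelerator together aim to close.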