PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption

Confidential computing on GPUs, like NVIDIA H100, mitigates the security risks of outsourced Large Language Models (LLMs) by implementing strong isolation and data encryption. Nonetheless, this encryption incurs a significant performance overhead, reaching up to 52.8 percent and 88.2 percent through...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Tan, Yifan, Tan, Cheng, Mi, Zeyu, Chen, Haibo
Format:	Artikel
Sprache:	eng
Schlagworte:	Computer Science - Cryptography and Security Computer Science - Distributed, Parallel, and Cluster Computing
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Schreiben Sie den ersten Kommentar!