Vidur: A Large-Scale Simulation Framework For LLM Inference

Optimizing the deployment of large language models (LLMs) is expensive today because it requires experimentally running an application workload against an LLM implementation while exploring the large configuration space formed by system knobs such as parallelization strategies, batching techniques, and sc...

Bibliographic details
Main authors: Agrawal, Amey; Kedia, Nitin; Mohan, Jayashree; Panwar, Ashish; Kwatra, Nipun; Gulavani, Bhargav; Ramjee, Ramachandran; Tumanov, Alexey
Format: Article
Language: English