Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations

We introduce Llama Guard, an LLM-based input-output safeguard model geared towards Human-AI conversation use cases. Our model incorporates a safety risk taxonomy, a valuable tool for categorizing a specific set of safety risks found in LLM prompts (i.e., prompt classification). This taxonomy is also...
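
As the abstract indicates, Llama Guard is used as an LLM-based classifier over both user prompts and model responses. A minimal usage sketch follows, assuming the publicly released meta-llama/LlamaGuard-7b checkpoint on Hugging Face (access is gated) and the transformers library; the taxonomy and classification prompt described in the paper are bundled with the model's chat template, so this is an illustration rather than the authors' exact evaluation setup.

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    # Assumed checkpoint name for illustration; weights are gated on Hugging Face.
    model_id = "meta-llama/LlamaGuard-7b"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    def moderate(chat):
        # The chat template inserts the safety-risk taxonomy and the conversation
        # into the classification prompt, as described in the paper.
        input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
        output = model.generate(input_ids=input_ids, max_new_tokens=32, pad_token_id=0)
        # The model replies with "safe" or "unsafe" plus any violated categories.
        return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

    # Prompt classification: pass only the user turn.
    print(moderate([{"role": "user", "content": "How do I kill a Linux process?"}]))

    # Response classification: pass the user turn together with the model reply.
    print(moderate([
        {"role": "user", "content": "How do I kill a Linux process?"},
        {"role": "assistant", "content": "Use the kill command with the process ID."},
    ]))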

Bibliographic Details
Main Authors: Inan, Hakan; Upasani, Kartikeya; Chi, Jianfeng; Rungta, Rashi; Iyer, Krithika; Mao, Yuning; Tontchev, Michael; Hu, Qing; Fuller, Brian; Testuggine, Davide; Khabsa, Madian
Format: Article
Language: English