An Approach to Build Zero-Shot Slot-Filling System for Industry-Grade Conversational Assistants
Format: Article
Language: English
Abstract: We present an approach to build a Large Language Model (LLM) based slot-filling
system to perform Dialogue State Tracking in conversational assistants serving
a wide variety of industry-grade applications. Key requirements of this
system include: 1) use of smaller-sized models to meet low-latency
requirements and to enable convenient and cost-effective cloud and
customer-premise deployments, and 2) zero-shot capabilities to serve across a
wide variety of domains, slot types, and conversational scenarios. We adopt a
fine-tuning approach in which a pre-trained LLM is fine-tuned into a
slot-filling model using task-specific data. The fine-tuning data is prepared
carefully to cover a wide variety of slot-filling task scenarios that the
model is expected to face across various domains. We give details of the data
preparation and model building process, along with a detailed analysis of our
experimental evaluations. Results show that our prescribed approach to
slot-filling model building yields a 6.9% relative improvement in F1 over the
best baseline on a realistic benchmark, while reducing latency by 57%.
Moreover, the data we prepared improves F1 by 4.2% relative, on average,
across various slot types.
DOI: 10.48550/arxiv.2406.08848
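The abstract describes slot filling as a zero-shot LLM task but does not publish the paper's prompt or output format. As a minimal illustrative sketch (the function names, the JSON output convention, and the dialogue format below are assumptions, not taken from the paper), zero-shot slot filling with an instruction-tuned LLM can be framed as: build a prompt listing the requested slot names alongside the conversation, then parse the model's structured answer into a dialogue-state dictionary.

```python
import json


def build_slot_filling_prompt(dialogue, slot_names):
    """Compose a zero-shot slot-filling prompt for an instruction-tuned LLM.

    dialogue: list of (speaker, utterance) pairs.
    slot_names: slots the model should extract; the prompt asks for a JSON
    object mapping each slot to its value, or null when unmentioned.
    """
    turns = "\n".join(f"{speaker}: {text}" for speaker, text in dialogue)
    slots = ", ".join(slot_names)
    return (
        f"Extract the following slots from the conversation: {slots}.\n"
        "Answer with a JSON object; use null for slots not mentioned.\n\n"
        f"{turns}\n\nJSON:"
    )


def parse_slot_response(response, slot_names):
    """Parse the model's JSON answer into a dialogue-state dict.

    Keeps only the requested slots; falls back to all-None state when the
    model's output is not valid JSON.
    """
    try:
        values = json.loads(response)
    except json.JSONDecodeError:
        return {name: None for name in slot_names}
    return {name: values.get(name) for name in slot_names}
```

For example, given the turn `User: I need a flight to Boston on Friday` and slots `destination`, `date`, `seat_class`, a hypothetical model reply of `{"destination": "Boston", "date": "Friday", "seat_class": null}` parses into a state dict with `seat_class` left as `None`. Because the slot names are supplied at inference time rather than fixed in the model, the same prompt template serves new domains without retraining, which is the zero-shot property the paper targets.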