Design and Scheduling of an AI-based Queueing System
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | To leverage prediction models to make optimal scheduling decisions in service
systems, we must understand how predictive errors impact congestion due to
externalities on the delay of other jobs. Motivated by applications where
prediction models interact with human servers (e.g., content moderation), we
consider a large queueing system comprising many single-server queues where
the class of a job is estimated using a prediction model. By characterizing the
impact of mispredictions on congestion cost in heavy traffic, we design an
index-based policy that incorporates the predicted class information in a
near-optimal manner. Our theoretical results guide the design of predictive
models by providing a simple model selection procedure with downstream queueing
performance as a central concern, and offer novel insights on how to design
queueing systems with AI-based triage. We illustrate our framework on a content
moderation task based on real online comments, where we construct toxicity
classifiers by fine-tuning large language models. |
---|---|
DOI: | 10.48550/arxiv.2406.06855 |