WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback
Main authors: | , , , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | As large language models (LLMs) continue to advance, aligning these models
with human preferences has emerged as a critical challenge. Traditional
alignment methods, relying on human- or LLM-annotated datasets, are limited by
their resource-intensive nature, inherent subjectivity, and the risk of
feedback loops that amplify model biases. To overcome these limitations, we
introduce WildFeedback, a novel framework that leverages real-time, in-situ
user interactions to create preference datasets that more accurately reflect
authentic human values. WildFeedback operates through a three-step process:
feedback signal identification, preference data construction, and user-guided
evaluation. We applied this framework to a large corpus of user-LLM
conversations, resulting in a rich preference dataset that reflects genuine
user preferences. This dataset captures the nuances of user preferences by
identifying and classifying feedback signals within natural conversations,
thereby enabling the construction of more representative and context-sensitive
alignment data. Our extensive experiments demonstrate that LLMs fine-tuned on
WildFeedback exhibit significantly improved alignment with user preferences, as
evidenced by both traditional benchmarks and our proposed user-guided
evaluation. By incorporating real-time feedback from actual users, WildFeedback
addresses the scalability, subjectivity, and bias challenges that plague
existing approaches, marking a significant step toward developing LLMs that are
more responsive to the diverse and evolving needs of their users. In summary,
WildFeedback offers a robust, scalable solution for aligning LLMs with true
human values, setting a new standard for the development and evaluation of
user-centric language models. |
---|---|
DOI: | 10.48550/arxiv.2408.15549 |