Language models are weak learners
A central notion in practical and theoretical machine learning is that of a $\textit{weak learner}$, classifiers that achieve better-than-random performance (on any given distribution over data), even by a small margin. Such weak learners form the practical basis for canonical machine learning metho...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | A central notion in practical and theoretical machine learning is that of a
$\textit{weak learner}$, classifiers that achieve better-than-random
performance (on any given distribution over data), even by a small margin. Such
weak learners form the practical basis for canonical machine learning methods
such as boosting. In this work, we illustrate that prompt-based large language
models can operate effectively as said weak learners. Specifically, we
illustrate the use of a large language model (LLM) as a weak learner in a
boosting algorithm applied to tabular data. We show that by providing (properly
sampled according to the distribution of interest) text descriptions of tabular
data samples, LLMs can produce a summary of the samples that serves as a
template for classification and achieves the aim of acting as a weak learner on
this task. We incorporate these models into a boosting approach, which in some
settings can leverage the knowledge within the LLM to outperform traditional
tree-based boosting. The model outperforms both few-shot learning and
occasionally even more involved fine-tuning procedures, particularly for tasks
involving small numbers of data points. The results illustrate the potential
for prompt-based LLMs to function not just as few-shot learners themselves, but
as components of larger machine learning pipelines. |
---|---|
DOI: | 10.48550/arxiv.2306.14101 |