All Random Features Representations are Equivalent
Format: Article
Language: English
Summary: Random features are a powerful technique for rewriting positive-definite kernels as linear products. They bring linear tools to bear in important nonlinear domains like KNNs and attention. Unfortunately, practical implementations require approximating an expectation, usually via sampling. This has led to the development of increasingly elaborate representations with ever lower sample error. We resolve this arms race by deriving an optimal sampling policy. Under this policy all random features representations have the same approximation error, which we show is the lowest possible. This means that we are free to choose whatever representation we please, provided we sample optimally.
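As an illustrative sketch of the technique the abstract describes (not the paper's own construction): the classic random Fourier features of Rahimi and Recht rewrite the RBF kernel k(x, y) = exp(-||x - y||^2 / 2) as an expectation of a linear product over Gaussian frequency samples, so the kernel is approximated by an ordinary dot product of feature vectors. The sampling error shrinks as the number of sampled features D grows.

```python
import numpy as np

rng = np.random.default_rng(0)
d, D = 5, 10000  # input dimension, number of sampled random features

# For w ~ N(0, I), Bochner's theorem gives
#   k(x, y) = E_w[cos(w.x) cos(w.y) + sin(w.x) sin(w.y)],
# so a Monte Carlo estimate with D samples yields a linear product.
W = rng.standard_normal((D, d))

def phi(x):
    # Random feature map: k(x, y) is approximated by phi(x) @ phi(y).
    return np.concatenate([np.cos(W @ x), np.sin(W @ x)]) / np.sqrt(D)

x, y = rng.standard_normal(d), rng.standard_normal(d)
exact = np.exp(-np.linalg.norm(x - y) ** 2 / 2)
approx = phi(x) @ phi(y)
print(abs(exact - approx))  # sampling error, shrinks roughly as 1/sqrt(D)
```

The paper's point is that choices like this sine/cosine map versus more elaborate representations all achieve the same (minimal) approximation error once the frequencies are sampled under the optimal policy, rather than i.i.d. from the base distribution as above.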
DOI: 10.48550/arxiv.2406.18802