Direct and indirect algorithms for on-line learning of disjunctions

It is easy to design on-line learning algorithms for learning k out of n variable monotone disjunctions by simply keeping one weight per disjunction. Such algorithms use roughly O( n k ) weights which can be prohibitively expensive. Surprisingly, algorithms like Winnow require only n weights (one pe...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Theoretical computer science 2002-07, Vol.284 (1), p.109-142
Hauptverfasser: Helmbold, D.P, Panizza, S, Warmuth, M.K
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:It is easy to design on-line learning algorithms for learning k out of n variable monotone disjunctions by simply keeping one weight per disjunction. Such algorithms use roughly O( n k ) weights which can be prohibitively expensive. Surprisingly, algorithms like Winnow require only n weights (one per variable or attribute) and the mistake bound of these algorithms is not too much worse than the mistake bound of the more costly algorithms. The purpose of this paper is to investigate how exponentially many weights can be collapsed into only O( n) weights. In particular, we consider probabilistic assumptions that enable the Bayes optimal algorithm's posterior over the disjunctions to be encoded with only O( n) weights. This results in a new O( n) algorithm for learning disjunctions which is related to the Bylander's BEG algorithm originally introduced for linear regression. Besides providing a Bayesian interpretation for this new algorithm, we are also able to obtain mistake bounds for the noise free case resembling those that have been derived for the Winnow algorithm. The same techniques used to derive this new algorithm also provide a Bayesian interpretation for a normalized version of Winnow.
ISSN:0304-3975
1879-2294
DOI:10.1016/S0304-3975(01)00081-0