Playing monotone games to understand learning behaviors

Published in: Theoretical Computer Science, May 2010, Vol. 411 (25), pp. 2384-2405
Authors: Apolloni, Bruno; Bassis, Simone; Gaito, Sabrina; Malchiodi, Dario; Zoppis, Italo
Format: Article
Language: English
Abstract: We deal with a special class of games against nature which correspond to subsymbolic learning problems where we know a local descent direction in the error landscape but not the amount gained at each step of the learning procedure. Namely, Alice and Bob play a game where the probability of victory grows monotonically, by unknown amounts, with the resources each employs. For a fixed effort on Alice’s part, Bob increases his resources on the basis of the results of the individual contests (victory, tie or defeat). Quite unlike the usual aims in game theory, his aim is to stop as soon as the defeat probability falls below a given threshold with high confidence. We adopt such a game policy as an archetypal remedy to the general overtraining threat of learning algorithms. Namely, we deal with the original game in a computational learning framework analogous to the Probably Approximately Correct formulation. Therein, a wise use of a special inferential mechanism (known as the twisting argument) highlights relevant statistics for managing different trade-offs between observability and controllability of the defeat probability. With similar statistics we discuss an analogous trade-off at the basis of the stopping criterion of subsymbolic learning procedures. As a conclusion, we propose a principled stopping rule based solely on the behavior of the training session, hence without diverting examples into a test set.
ISSN: 0304-3975, 1879-2294
DOI: 10.1016/j.tcs.2010.02.011
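
To make the stopping policy described in the abstract concrete, the Python sketch below simulates one plausible version of the game: Bob raises his resource after each defeat and stops once an upper confidence bound on his defeat probability, estimated from a sliding window of recent outcomes, drops below a target threshold. This is only an illustration under invented assumptions: the contest model, the resource-update rule, the window size, and the plain Hoeffding bound are all placeholders, ties are ignored, and the paper's own twisting-argument statistics are not reproduced here.

```python
import math
import random

def play_contest(bob_resource, alice_resource=10.0):
    """One contest of the monotone game: Bob's defeat probability shrinks
    monotonically as his resource grows, but the gain per unit of resource
    is unknown to him (hypothetical contest model, not the paper's)."""
    p_defeat = alice_resource / (alice_resource + bob_resource)
    return "defeat" if random.random() < p_defeat else "victory"

def bob_plays(threshold=0.2, confidence=0.95, window=50, max_steps=10_000):
    """Bob raises his resource after each defeat and stops as soon as an
    upper confidence bound on the defeat probability, estimated from the
    last `window` outcomes, drops below `threshold`."""
    resource = 1.0
    outcomes = []  # 1 = defeat, 0 = victory
    for step in range(1, max_steps + 1):
        result = play_contest(resource)
        outcomes.append(1 if result == "defeat" else 0)
        if result == "defeat":
            resource += 1.0  # improvement step of unknown effect
        recent = outcomes[-window:]
        if len(recent) == window:
            p_hat = sum(recent) / window
            # Hoeffding-style upper bound holding with the chosen confidence
            ucb = p_hat + math.sqrt(math.log(1.0 / (1.0 - confidence)) / (2 * window))
            if ucb < threshold:
                return step, resource, ucb
    return max_steps, resource, None

if __name__ == "__main__":
    random.seed(0)
    steps, resource, bound = bob_plays()
    print(f"stopped after {steps} contests with resource {resource:.0f}, bound {bound}")
```

Tightening the threshold or the confidence delays stopping but makes the guarantee stronger, which loosely mirrors the trade-off between observability and controllability of the defeat probability mentioned in the abstract.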