Discussion on Hedging Predictions in Machine Learning By A. Gammerman, V. Vovk

Bibliographic details
Published in: The Computer Journal, 2007-01, Vol. 50 (2), pp. 164-172
Author: Chervonenkis, Alexey
Format: Article
Language: English
Online access: Full text
Description
Abstract: A large variety of machine-learning algorithms are now developed and applied in different areas of science and industry. These techniques share a typical drawback: there is no confidence measure for the predicted output value of a particular new object. The main idea of the article is to look over all possible labelings of a new object and to evaluate the strangeness of each labeling in comparison with the labelings of the objects in the training set. The problem is to find an appropriate measure of strangeness. Initially, the authors try to apply the ideas of Kolmogorov complexity to estimate the strangeness of a labeling. But this complexity is, firstly, not computable; secondly, it is defined only up to a constant; and finally, it applies to the whole sequence of objects rather than to one particular object. So the authors came to another idea (still inspired by Kolmogorov complexity): based on a particular machine-learning algorithm, it is possible to find a reasonable measure of the strangeness of an object together with its labeling. For regression (or ridge regression) it could be the absolute difference between the predicted and the real output value: the larger the difference, the stranger the object. In the SVM approach to pattern recognition it could be the weight of a support vector: the larger the weight, the more doubtful its labeling; similar measures of strangeness may be proposed for other algorithms. The protocol is then as follows: look through all possible labelings of a new object; for each labeling, add the object to the training set, apply the machine-learning algorithm, and rank the objects by their measure of strangeness; estimate the credibility of this labeling as one minus the ratio of the number of objects in the set stranger than the new one to the total number of objects in the set. This approach seems to be new and powerful. Its main advantage is that it is non-parametric and based only on the i.i.d. assumption; in comparison with the Bayesian approach, no prior distribution is used. The main theoretical result is the proof of validity of the proposed conformal predictors: on average, conformal predictors never over-rate the accuracy and reliability of their predictions. The second result is that, asymptotically, the relative number of cases in which the real output value falls within the confidence interval converges to the confidence level claimed by the conformal predictor. Software implementing the proposed technique is now applied to a large variety of practical problems.
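
The protocol sketched in the abstract can be illustrated with a short example. The following Python sketch is an illustrative assumption, not the implementation described in the article: it uses a nearest-neighbour strangeness measure and toy data, tries every candidate label for a new object, adds the labeled object to the training set, and reports the fraction of examples at least as strange as the new one.

```python
import numpy as np

def strangeness(i, X, y):
    # Nonconformity ("strangeness") score: distance to the nearest example
    # with the same label divided by the distance to the nearest example
    # with a different label; a larger ratio means a stranger labeling.
    d = np.linalg.norm(X - X[i], axis=1)
    d[i] = np.inf                                   # ignore the example itself
    same = d[y == y[i]].min()
    other = d[y != y[i]].min() if np.any(y != y[i]) else np.inf
    return same / other

def conformal_p_values(X_train, y_train, x_new, labels):
    # For each candidate label: add (x_new, label) to the training set,
    # compute every example's strangeness, and return the fraction of
    # examples at least as strange as the new one (its p-value).
    p = {}
    for label in labels:
        X = np.vstack([X_train, x_new])
        y = np.append(y_train, label)
        scores = np.array([strangeness(i, X, y) for i in range(len(y))])
        p[label] = np.mean(scores >= scores[-1])
    return p

# Toy usage: two well-separated classes and a new point close to class 0.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [1.1, 0.9]])
y_train = np.array([0, 0, 1, 1])
print(conformal_p_values(X_train, y_train, np.array([0.05, 0.1]), labels=[0, 1]))
```

In this sketch the candidate label 0 receives a much larger p-value than label 1, since labeling the new point as class 1 makes it the strangest example in the augmented set.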
ISSN: 0010-4620
DOI: 10.1093/comjnl/bxl066