Error Bounds and Improved Probability Estimation using the Maximum Likelihood Set

Bibliographic Details
Main authors:	Karakos, D., Khudanpur, S.
Format:	Conference proceeding
Language:	English
Subjects:	Bayesian methods; Entropy; Estimation error; High level synthesis; Maximum likelihood estimation; Multilevel systems; Natural languages; Probability distribution; Smoothing methods; Speech processing

Description
Abstract:	The maximum likelihood set (MLS) is a novel candidate for nonparametric probability estimation from small samples that permits incorporating prior or structural knowledge into the estimator. It is a set of probability distributions which assign to the observed type (or empirical distribution) a likelihood that is no lower than the likelihood they assign to any other type. The MLS has been shown to have many highly desirable properties, including strong consistency of MLS-based estimates; yet the probability that the MLS contains the data-generating distribution may be arbitrarily small. In this paper, we propose to overcome this shortcoming via an ε-fattening of the MLS. The proposed set, called the High Likelihood Set (HLS), with ε → 0 slowly in the sample size, ensures that the HLS contains the data-generating distribution with arbitrarily large probability, while retaining most desirable properties of the MLS. In particular, the HLS provides a "high-probability" bound on the estimation error, and experimental results in statistical language modeling show improved operational performance from HLS-based estimates.
ISSN:	2157-8095 2157-8117
DOI:	10.1109/ISIT.2007.4557150
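
The MLS definition in the abstract can be illustrated with a small brute-force membership check. This is only a sketch under stated assumptions, not the authors' implementation: it assumes that the "likelihood of a type" means the multinomial probability of the corresponding count vector, and the names `multinomial_pmf`, `all_types`, and `in_mls` are hypothetical helpers introduced here for illustration. The ε-fattening that yields the HLS is not sketched, since the abstract does not give its exact form.

```python
from itertools import product
from math import factorial, prod


def multinomial_pmf(counts, p):
    """Probability of observing the count vector `counts` in sum(counts) i.i.d. draws from p."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # exact integer division at every step
    return coef * prod(pi ** c for pi, c in zip(p, counts))


def all_types(n, k):
    """Yield every count vector of length k whose entries sum to n."""
    for head in product(range(n + 1), repeat=k - 1):
        if sum(head) <= n:
            yield head + (n - sum(head),)


def in_mls(p, observed, tol=1e-12):
    """Assumed MLS test: p gives the observed type at least as much
    probability as it gives any other type of the same sample size."""
    n, k = sum(observed), len(observed)
    target = multinomial_pmf(observed, p)
    return all(multinomial_pmf(t, p) <= target + tol for t in all_types(n, k))


observed = (2, 1, 0)  # three samples over a three-symbol alphabet
print(in_mls((2 / 3, 1 / 3, 0.0), observed))  # empirical distribution
print(in_mls((0.5, 0.3, 0.2), observed))      # puts mass on the unseen symbol
print(in_mls((0.1, 0.1, 0.8), observed))      # favours a very different type
```

Under this reading, a distribution that gives positive mass to an unseen symbol can still belong to the set, which is what makes it usable for smoothing. The enumeration grows combinatorially with the sample size and alphabet size, so the check is only practical for toy inputs.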