The NIST 2008 Metrics for machine translation challenge—overview, methodology, metrics, and results

Bibliographic details
Published in:	Machine translation 2009-09, Vol.23 (2/3), p.71-103
Main authors:	Przybocki, Mark; Peterson, Kay; Bronsart, Sébastien; Sanders, Gregory
Format:	Article
Language:	English
Subjects:	Artificial Intelligence; Computational Linguistics; Computer Science; Correlations; Datasets; Error rates; Language translation; Machine translation; Names; Natural Language Processing (NLP); Statistics; Supernova remnants; Test data
Online access:	Full text

Description
Abstract:	This paper discusses the evaluation of automated metrics developed for the purpose of evaluating machine translation (MT) technology. A general discussion of the usefulness of automated metrics is offered. The NIST MetricsMATR evaluation of MT metrology is described, including its objectives, protocols, participants, and test data. The methodology employed to evaluate the submitted metrics is reviewed. A summary is provided for the general classes of evaluated metrics. Overall results of this evaluation are presented, primarily by means of correlation statistics, showing the degree of agreement between the automated metric scores and the scores of human judgments. Metrics are analyzed at the sentence, document, and system level with results conditioned by various properties of the test data. This paper concludes with some perspective on the improvements that should be incorporated into future evaluations of metrics for MT evaluation.
ISSN:	0922-6567 1573-0573
DOI:	10.1007/s10590-009-9065-6