Accelerated MM Algorithms for Ranking Scores Inference from Comparison Data

In this paper, we study a popular method for inference of the Bradley-Terry model parameters, namely the MM algorithm, for maximum likelihood estimation and maximum a posteriori probability estimation. This class of models includes the Bradley-Terry model of paired comparisons, the Rao-Kupper model...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Vojnovic, Milan, Yun, Seyoung, Zhou, Kaifang
Format:	Artikel
Sprache:	eng
Schlagworte:	Computer Science - Learning Mathematics - Optimization and Control Statistics - Machine Learning
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	In this paper, we study a popular method for inference of the Bradley-Terry model parameters, namely the MM algorithm, for maximum likelihood estimation and maximum a posteriori probability estimation. This class of models includes the Bradley-Terry model of paired comparisons, the Rao-Kupper model of paired comparisons allowing for tie outcomes, the Luce choice model, and the Plackett-Luce ranking model. We establish tight characterizations of the convergence rate for the MM algorithm, and show that it is essentially equivalent to that of a gradient descent algorithm. For the maximum likelihood estimation, the convergence is shown to be linear with the rate crucially determined by the algebraic connectivity of the matrix of item pair co-occurrences in observed comparison data. For the Bayesian inference, the convergence rate is also shown to be linear, with the rate determined by a parameter of the prior distribution in a way that can make the convergence arbitrarily slow for small values of this parameter. We propose a simple modification of the classical MM algorithm that avoids the observed slow convergence issue and accelerates the convergence. The key component of the accelerated MM algorithm is a parameter rescaling performed at each iteration step that is carefully chosen based on theoretical analysis and characterisation of the convergence rate. Our experimental results, performed on both synthetic and real-world data, demonstrate the identified slow convergence issue of the classic MM algorithm, and show that significant efficiency gains can be obtained by our new proposed method.
DOI:	10.48550/arxiv.1901.00150