A new methodology for generating and combining statistical forecasting models to enhance competitive event prediction


Bibliographic details
Published in: European journal of operational research, 2012-04, Vol.218 (1), p.163-174
Main authors: Lessmann, Stefan; Sung, Ming-Chien; Johnson, Johnnie E.V.; Ma, Tiejun
Format: Article
Language: English
Online access: Full text
Description
Summary:
► New methodology for combining model-based forecasts in competitive events (CE).
► Demonstrate that the predominant combination approach (averaging) fails in CE.
► Develop stacking approach with conditional-logit and LLR-based forecast selection.
► Verify the new model’s effectiveness in large-scale empirical study.
► Explain results in terms of the strength/diversity trade-off.

Forecasting methods are routinely employed to predict the outcome of competitive events (CEs) and to shed light on the factors that influence participants’ winning prospects (e.g., in sports events, political elections). Combining statistical models’ forecasts, shown to be highly successful in other settings, has been neglected in CE prediction. Two particular difficulties arise when developing model-based composite forecasts of CE outcomes: the intensity of rivalry among contestants, and the strength/diversity trade-off among individual models. To overcome these challenges we propose a range of surrogate measures of event outcome to construct a heterogeneous set of base forecasts. To effectively extract the complementary information concealed within these predictions, we develop a novel pooling mechanism which accounts for competition among contestants: a stacking paradigm integrating conditional logit regression and log-likelihood-ratio-based forecast selection. Empirical results using data related to horseracing events demonstrate that: (i) base model strength and diversity are important when combining model-based predictions for CEs; (ii) average-based pooling, commonly employed elsewhere, may not be appropriate for CEs (because average-based pooling exclusively focuses on strength); and (iii) the proposed stacking ensemble provides statistically and economically accurate forecasts. These results have important implications for regulators of betting markets associated with CEs and in particular for the accurate assessment of market efficiency.
ISSN: 0377-2217, 1872-6860
DOI: 10.1016/j.ejor.2011.10.032
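
The summary contrasts average-based pooling with a stacking step built on conditional logit regression and log-likelihood-ratio (LLR) based forecast selection. The following is only a minimal sketch of that general idea, not the authors' implementation: this record does not give the article's estimation procedure or selection rule. All function names, the example scores, and the weights are hypothetical; in practice the weights would be estimated by maximum likelihood rather than set by hand.

```python
import math


def conditional_logit_probs(base_scores, weights):
    """Winning probabilities for the runners of ONE race.

    base_scores: one list per runner holding that runner's base-model scores.
    weights: one (hypothetical, hand-set) coefficient per base model.
    Probabilities are normalised within the race, so they sum to 1 and
    reflect competition among the contestants.
    """
    utilities = [sum(w * s for w, s in zip(weights, runner))
                 for runner in base_scores]
    shift = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - shift) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def average_pool(base_scores):
    """Plain average of the base-model scores per runner (no race normalisation)."""
    return [sum(runner) / len(runner) for runner in base_scores]


def log_likelihood(races, winners, weights):
    """Sum of log winning probabilities assigned to the observed winners."""
    return sum(math.log(conditional_logit_probs(race, weights)[winner])
               for race, winner in zip(races, winners))


def llr_statistic(races, winners, weights_full, weights_restricted):
    """Log-likelihood-ratio statistic comparing two weight vectors.

    Setting a weight to zero in weights_restricted corresponds to dropping
    that base model from the combination.
    """
    return 2.0 * (log_likelihood(races, winners, weights_full)
                  - log_likelihood(races, winners, weights_restricted))


# Hypothetical toy data: two races, three runners each, two base models.
races = [
    [[0.5, 0.2], [0.1, 0.4], [0.3, 0.3]],
    [[0.2, 0.6], [0.4, 0.1], [0.3, 0.2]],
]
winners = [0, 0]  # index of the winning runner in each race

probs = conditional_logit_probs(races[0], [1.0, 1.0])
averaged = average_pool(races[0])
llr = llr_statistic(races, winners, [1.0, 1.0], [1.0, 0.0])
```

A larger LLR value would indicate that the base model dropped in the restricted weight vector contributes to explaining the observed winners; how such a statistic is turned into a selection decision in the article is not stated in this record.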