Data-driven team ranking and match performance analysis in Chinese Football Super League
Published in: | Chaos, Solitons and Fractals, 2020-12, Vol. 141, p. 110330, Article 110330 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | •The data-driven team-rank framework provided a consolidated evaluation of football team performance through a multi-dimensional approach. •Machine learning techniques proved highly applicable to rating team performance and predicting league ranking. •Defensive ability and shooting accuracy are the most important performance aspects distinguishing winning from losing teams in the Chinese Football Super League, and highly ranked teams maintain more consistent performance than their peers.
Recent years have seen a growing body of research on evaluating team-level technical-tactical performance in association football using match event data. However, most studies took a mono-dimensional approach and modeled the influence of each performance aspect on match results in isolation, which limited the interpretability of the results. This study aimed to apply a state-of-the-art algorithm to rank team performance and to identify key performance features related to match outcome, based on a large match dataset. Data from all 1200 matches of the 2014 to 2018 Chinese Football Super League (CSL) seasons were used. From the original 164 match events, we extracted 22 features related to attacking, passing, and defending performance. A Linear Support Vector Classifier (LSVC) model was then built with these 22 input features and trained to rank the teams by their performance and to analyze the features that most influence match outcome (win/not win), with the dataset split 4:1 into training and validation sets. The data-driven LSVC model achieved a prediction accuracy of 0.83, and both the ranking of teams’ match performance and the predicted league standings were highly correlated with the actual final league rankings. Saves, pass success, and shots on target in the penalty area were the top positive features for winning, whereas shots on target during open play, passes, and bad shot% were the three negative features most influential for the match result. In general, CSL winning teams build their success on defensive ability and shooting accuracy, and high-ranked teams consistently maintain better performance than their counterparts.
The team-rank framework could provide a consolidated and complex approach to evaluate the match performance quality of the teams, refining decisions-makin |
---|---|
ISSN: | 0960-0779 1873-2887 |
DOI: | 10.1016/j.chaos.2020.110330 |
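
The workflow described in the summary (a linear SVM on match features, a 4:1 train/validation split, feature weights read as importance, and teams ranked by model score) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the three feature names are borrowed from the abstract only as labels, the team names, strengths, and generating weights are made up, and the optimiser is a plain sub-gradient descent on the hinge loss standing in for whatever LSVC implementation the study used.

```python
import random

random.seed(0)

# Hypothetical stand-ins: three of the feature names mentioned in the abstract.
FEATURES = ["saves", "pass_success", "bad_shot_pct"]
TRUE_W = [1.5, 1.0, -1.2]  # made-up weights, used only to generate toy outcomes
TEAMS = {f"team_{i}": -1.0 + 2.0 * i / 7 for i in range(8)}  # made-up strengths


def make_matches(per_team=30):
    """Generate (team, features, label) rows; label is +1 (win) or -1 (not win)."""
    rows = []
    for team, s in TEAMS.items():
        for _ in range(per_team):
            x = [random.gauss(s, 1), random.gauss(s, 1), random.gauss(-s, 1)]
            score = sum(w * v for w, v in zip(TRUE_W, x)) + random.gauss(0, 0.5)
            rows.append((team, x, 1 if score > 0 else -1))
    return rows


def decision(w, b, x):
    """Signed score of a match under the linear model."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(data, epochs=50, lr=0.01, C=1.0):
    """Sub-gradient descent on the L2-regularised hinge loss."""
    w, b = [0.0] * len(FEATURES), 0.0
    n = len(data)
    for _ in range(epochs):
        for _, x, y in data:
            violated = y * decision(w, b, x) < 1  # inside the margin or misclassified
            for j in range(len(w)):
                grad = w[j] / n - (C * y * x[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * C * y
    return w, b


# 4:1 split into training and validation sets.
matches = make_matches()
random.shuffle(matches)
cut = len(matches) * 4 // 5
train, valid = matches[:cut], matches[cut:]

weights, bias = train_linear_svm(train)
hits = sum((decision(weights, bias, x) > 0) == (y == 1) for _, x, y in valid)
accuracy = hits / len(valid)

# Rank teams by their mean decision score over all of their matches.
team_scores = {t: [] for t in TEAMS}
for t, x, _ in matches:
    team_scores[t].append(decision(weights, bias, x))
ranking = sorted(TEAMS, key=lambda t: -sum(team_scores[t]) / len(team_scores[t]))

# Features ordered by absolute weight; the sign shows the direction of influence.
for name, w in sorted(zip(FEATURES, weights), key=lambda p: -abs(p[1])):
    print(f"{name:>14s} {w:+.2f}")
print(f"validation accuracy: {accuracy:.2f}")
print("ranking:", ranking)
```

In a linear model of this kind, the sign of each learned weight indicates whether a feature pushes a match toward "win" or "not win", and averaging the decision score per team gives one plausible way to turn match-level predictions into a team ranking. Whether the study aggregated scores in exactly this way is not stated in the record.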