Hockey Pose Estimation and Action Recognition using Convolutional Neural Networks to Ice Hockey

Bibliographic details
Main author: Neher, Helmut
Format: Dissertation
Language: English
Description
Summary: Human pose estimation and action recognition in ice hockey are among the biggest challenges in computer vision-driven sports analytics, with difficulties such as bulky hockey wear, color similarity between the ice rink and player jerseys, and the presence of additional equipment used by the players, such as hockey sticks. As a result, deep neural network architectures typically used for sports including baseball, soccer, and track and field perform poorly when applied to hockey. This research involves the design and implementation of deep neural networks for both pose estimation and action recognition that can effectively evaluate the pose and the actions of a hockey player. First, a pre-trained convolutional neural network, known as the stacked hourglass network, is used to determine a hockey player's body placement in video frames, a task known as pose estimation. The proposed method provides a tool to analyze the pose of a hockey player from broadcast video, which aids in the eventual assessment of a player's speed, shot accuracy, or other metrics. The algorithm proved successful, identifying on average 81.56% of the joints of a hockey player on a set of test images. Furthermore, inspired by the idea that modeling the pose of a hockey stick can improve hockey player pose estimation, a novel deep learning computer vision architecture known as the HyperStackNet has been designed and implemented for joint player and stick pose estimation. In addition to improving player pose estimation, the HyperStackNet architecture enables improved transfer learning from pre-trained stacked hourglass networks trained on a different domain.
Experimental results demonstrate that when the HyperStackNet is trained to detect 18 different joint positions on a hockey player (including the hockey stick), the accuracy is 98.8% on the test dataset, demonstrating its efficacy for handling complex joint player and stick pose estimation from video. Extending beyond pose estimation, this research also develops an algorithm for accurate recognition of actions in hockey. To perform this action recognition, a convolutional neural network estimates actions by unifying latent pose and action recognition. The action recognition hourglass network, or ARHN, is designed to interpret player actions in ice hockey video using estimated pose. ARHN has three components. The first component is the latent pose estimator, the second transforms latent