Large-Scale Video Classification with Feature Space Augmentation coupled with Learned Label Relations and Ensembling

This paper presents the Axon AI's solution to the 2nd YouTube-8M Video Understanding Challenge, achieving the final global average precision (GAP) of 88.733% on the private test set (ranked 3rd among 394 teams, not considering the model size constraint), and 87.287% using a model that meets siz...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	arXiv.org 2018-09
Hauptverfasser:	Cho, Choongyeun, Antin, Benjamin, Arora, Sanchit, Ashrafi, Shwan, Duan, Peilin, Dang The Huynh, Lee, James, Hang Tuan Nguyen, Solgi, Mojtaba, Cuong Van Than
Format:	Artikel
Sprache:	eng
Schlagworte:	Constraint modelling Performance enhancement Regularization Training Video data
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	This paper presents the Axon AI's solution to the 2nd YouTube-8M Video Understanding Challenge, achieving the final global average precision (GAP) of 88.733% on the private test set (ranked 3rd among 394 teams, not considering the model size constraint), and 87.287% using a model that meets size requirement. Two sets of 7 individual models belonging to 3 different families were trained separately. Then, the inference results on a training data were aggregated from these multiple models and fed to train a compact model that meets the model size requirement. In order to further improve performance we explored and employed data over/sub-sampling in feature space, an additional regularization term during training exploiting label relationship, and learned weights for ensembling different individual models.
ISSN:	2331-8422