Two-Stream Convolutional Neural Network for Video Action Recognition

Video action recognition is widely used in video surveillance, behavior detection, human-computer interaction, medically assisted diagnosis and motion analysis. However, video action recognition can be disturbed by many factors, such as background, illumination and so on. Two-stream convolutional ne...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:KSII transactions on Internet and information systems 2021, 15(10), , pp.3668-3684
Hauptverfasser: Qiao, Han, Liu, Shuang, Xu, Qingzhen, Liu, Shouqiang, Yang, Wanggan
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Video action recognition is widely used in video surveillance, behavior detection, human-computer interaction, medically assisted diagnosis and motion analysis. However, video action recognition can be disturbed by many factors, such as background, illumination and so on. Two-stream convolutional neural network uses the video spatial and temporal models to train separately, and performs fusion at the output end. The multi segment Two-Stream convolutional neural network model trains temporal and spatial information from the video to extract their feature and fuse them, then determine the category of video action. Google Xception model and the transfer learning is adopted in this paper, and the Xception model which trained on ImageNet is used as the initial weight. It greatly overcomes the problem of model underfitting caused by insufficient video behavior dataset, and it can effectively reduce the influence of various factors in the video. This way also greatly improves the accuracy and reduces the training time. What's more, to make up for the shortage of dataset, the kinetics400 dataset was used for pre-training, which greatly improved the accuracy of the model. In this applied research, through continuous efforts, the expected goal is basically achieved, and according to the study and research, the design of the original dual-flow model is improved. Keywords: video action recognition, multi segment, two-stream convolutional neural network, transfer learning, pre-training
ISSN:1976-7277
1976-7277
DOI:10.3837/tiis.2021.10.011