Scene Recognition by Joint Learning of DNN from Bag of Visual Words and Convolutional DCT Features

Bibliographic Details
Published in: Applied artificial intelligence 2021-07, Vol. 35 (9), p. 623-641
Main authors: Rehman, Abdul; Saleem, Summra; Khan, Usman Ghani; Jabeen, Saira; Shafiq, M. Omair
Format: Article
Language: English
Description
Abstract: Scene recognition is used in many computer vision and related applications, including information retrieval, robotics, real-time monitoring, and event classification. Because scene recognition is a complex task, it has benefited greatly from deep learning architectures that can be trained on large, comprehensive datasets. This paper presents a scene classification method in which local and global features are concatenated with the DCT-convolutional features of AlexNet. The combined features are fed into AlexNet's fully connected layers for classification. The local and global features are made efficient by selecting an appropriate Bag of Visual Words (BOVW) size and by applying feature selection techniques, both of which are evaluated in the experimentation section. We modified AlexNet by adding dense fully connected layers and compared its results with those of the model previously trained on the Places365 dataset. Our model is also compared with other scene recognition methods, and it outperforms them in accuracy.
ISSN: 0883-9514, 1087-6545
DOI: 10.1080/08839514.2021.1881296
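
The feature fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`bovw_histogram`, `dct2_2d`, `joint_feature`), the nearest-codeword assignment, the unnormalized DCT-II, and the choice to keep only a `keep` x `keep` block of low-frequency coefficients per feature map are all assumptions made for illustration. The record does not say how the DCT is applied or how the codebook is built, and the fully connected classifier is omitted here.

```python
import math


def bovw_histogram(descriptors, codebook):
    """L1-normalized histogram of nearest-codeword assignments (assumed scheme)."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(d, c)) for c in codebook]
        counts[dists.index(min(dists))] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def dct2_1d(x):
    """Unnormalized DCT-II of a 1-D sequence."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def dct2_2d(fmap):
    """Separable 2-D DCT-II: transform each row, then each column."""
    rows = [dct2_1d(r) for r in fmap]
    cols = [dct2_1d(list(c)) for c in zip(*rows)]
    # Transpose back so the result has the same layout as the input.
    return [list(r) for r in zip(*cols)]


def joint_feature(descriptors, codebook, feature_maps, keep=2):
    """Concatenate the BoVW histogram with low-frequency DCT coefficients.

    `feature_maps` stands in for convolutional activations (a list of 2-D
    lists); keeping the top-left `keep` x `keep` block is an assumption.
    """
    vec = bovw_histogram(descriptors, codebook)
    for fmap in feature_maps:
        coeffs = dct2_2d(fmap)
        for row in coeffs[:keep]:
            vec.extend(row[:keep])
    return vec
```

In this sketch the resulting vector has `len(codebook) + len(feature_maps) * keep * keep` entries, and it would then be passed to the dense layers for classification.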