Scene Categorization Model Using Deep Visually Sensitive Features
Published in: IEEE Access, 2019, Vol. 7, pp. 45230-45239
Main Authors:
Format: Article
Language: English
Subjects:
Online Access: Full Text
Abstract: Visually sensitive regions in a scene are thought to be important for scene categorization. In this paper, we propose to utilize the important visually sensitive information represented by deep features for scene categorization. Specifically, the context relationship between objects and their surroundings is fully utilized as the main basis for judging the content of the scene, and, combined with deep convolutional neural networks (CNNs), a scene categorization model based on deep visually sensitive features is constructed. First, the saliency regions of the scene images are marked according to a context-based saliency detection algorithm. Then, the original images and the corresponding visually sensitive region detection images are superimposed to obtain the visually sensitive region enhancement images. Next, the deep convolutional features of the original images, the visually sensitive region detection images, and the visually sensitive region enhancement images are extracted through deep CNNs pre-trained on the large-scale scene dataset Places. Finally, considering that the deep features extracted by different layers of the convolutional network have different discriminative power, fused features are generated from multiple convolutional layers to construct the visually sensitive CNN model (VS-CNN). To verify the effectiveness of the proposed model, experiments are conducted on five standard scene datasets: LabelMe, UIUC-Sports, Scene-15, MIT67, and SUN. The experimental results show that the proposed model is effective and has good adaptability. In particular, its categorization performance is superior to many state-of-the-art methods on complex indoor scenes.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2908448
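Two steps of the pipeline described in the abstract can be sketched in code: superimposing an image with its saliency (visually sensitive region) map to form an enhancement image, and fusing features pooled from multiple convolutional layers. This is a minimal NumPy sketch under assumed conventions; the function names, the pixel-wise weighting rule, and global-average-pooling as the fusion step are illustrative choices, not the paper's exact formulation (which uses CNNs pre-trained on Places).

```python
import numpy as np

def enhance_with_saliency(image, saliency):
    # Superimpose the original image and its saliency map to obtain a
    # visually sensitive region enhancement image. Assumed rule:
    # pixel-wise weighting that boosts salient regions.
    saliency = saliency / (saliency.max() + 1e-8)   # normalize to [0, 1]
    return np.clip(image * (1.0 + saliency[..., None]), 0, 255)

def fuse_layer_features(feature_maps):
    # Global-average-pool each layer's (H, W, C) feature map and
    # concatenate the pooled vectors, approximating multi-layer fusion.
    pooled = [fm.mean(axis=(0, 1)) for fm in feature_maps]
    return np.concatenate(pooled)

# Toy demo with random data standing in for a real image and CNN outputs.
img = np.random.rand(8, 8, 3) * 255
sal = np.random.rand(8, 8)
enh = enhance_with_saliency(img, sal)               # shape (8, 8, 3)
feats = fuse_layer_features([np.random.rand(4, 4, 16),
                             np.random.rand(2, 2, 32)])  # shape (48,)
```

In practice the feature maps would come from intermediate layers of a Places-pretrained CNN rather than random arrays, and the fused vector would feed a scene classifier.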