Leveraging Expert Feature Knowledge for Predicting Image Aesthetics

Bibliographic details
Published in: IEEE Transactions on Image Processing, 2018-10, Vol. 27 (10), p. 5100-5112
Main authors: Kucer, Michal; Loui, Alexander C.; Messinger, David W.
Format: Article
Language: English
Description
Summary: The ability to rank images by their appearance has many real-world applications, such as image retrieval and image album creation. Despite the recent dominance of deep learning methods in computer vision, which often yield superior performance, they are not always the methods of choice because they lack interpretability. In this paper, we investigate the possibility of improving the image aesthetic inference of convolutional neural networks with hand-designed features that rely on domain expertise in various fields. We compare hand-crafted feature sets in their ability to predict fine-grained aesthetics scores on two image aesthetics data sets. We observe that even feature sets published earlier are able to compete with more recently published algorithms and that, by combining the algorithms, a significant improvement in predicting image aesthetics is possible. Using a tree-based learner, we perform feature elimination to identify the best-performing features overall and across different image categories. Only roughly 15% and 8% of the features are needed to achieve full performance in predicting a fine-grained aesthetic score and in binary classification, respectively. By combining the hand-crafted features with metafeatures that predict the quality of an image based on convolutional neural network features, the model performs better than a baseline VGG16 model. One can, however, achieve a more significant improvement in both aesthetics score prediction and binary classification by fusing the hand-crafted features with the penultimate-layer activations. Our experiments indicate an improvement of up to 2.2%, achieving current state-of-the-art binary classification accuracy on the Aesthetic Visual Analysis (AVA) data set when the hand-designed features are fused with activations from the VGG16 and ResNet50 networks.
ISSN: 1057-7149, 1941-0042
DOI: 10.1109/TIP.2018.2845100
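
The summary mentions two steps that a small example can make concrete: fusing hand-crafted features with network activations, and keeping only a small fraction of the fused features. The sketch below is not the authors' pipeline. It assumes fusion by plain concatenation, uses synthetic random data, and swaps the tree-based learner for a much simpler stand-in (absolute Pearson correlation with the score) applied once rather than iteratively. All function names and feature counts are hypothetical.

```python
import random


def fuse(handcrafted, activations):
    """Early fusion (assumed here): concatenate both feature vectors per image."""
    return [h + a for h, a in zip(handcrafted, activations)]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def top_features(rows, scores, keep_fraction):
    """Rank columns by |correlation| with the score and keep the top fraction.

    Stand-in for tree-based importances; this is one-shot pruning,
    not recursive elimination.
    """
    n_features = len(rows[0])
    importance = [
        abs(pearson([row[j] for row in rows], scores)) for j in range(n_features)
    ]
    n_keep = max(1, round(keep_fraction * n_features))
    ranked = sorted(range(n_features), key=lambda j: importance[j], reverse=True)
    return sorted(ranked[:n_keep])


# Synthetic stand-ins: 12 "hand-crafted" and 8 "activation" values per image.
rng = random.Random(0)
n_images = 200
handcrafted = [[rng.gauss(0, 1) for _ in range(12)] for _ in range(n_images)]
activations = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(n_images)]
fused = fuse(handcrafted, activations)

# Hypothetical aesthetic score driven by hand-crafted feature 0 and
# activation 3, which is column 12 + 3 = 15 after fusion.
scores = [row[0] + row[15] + 0.1 * rng.gauss(0, 1) for row in fused]

kept = top_features(fused, scores, keep_fraction=0.15)
print("kept feature columns:", kept)
```

With 20 fused columns and a keep fraction of 0.15, three columns survive. The two informative ones (0 and 15) should be among them, while the third is whichever noise column happens to correlate most with the score. A tree-based learner would replace the correlation ranking in a closer reproduction.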