Why Are Deep Representations Good Perceptual Quality Features?
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Recently, intermediate feature maps of pre-trained convolutional neural
networks have yielded significant perceptual quality improvements when
used in the loss function for training new networks. It is believed that these
features are better at encoding the perceptual quality and provide more
efficient representations of input images compared to other perceptual metrics
such as SSIM and PSNR. However, there have been no systematic studies to
determine the underlying reason. Without such an analysis, it is not
possible to evaluate the performance of a particular set of features or to
further improve the perceptual quality by carefully selecting a subset of
features from a pre-trained CNN. This work shows that the capabilities of
pre-trained deep CNN features in optimizing the perceptual quality are
correlated with their success in capturing basic human visual perception
characteristics. In particular, we focus our analysis on fundamental aspects of
human perception, such as the contrast sensitivity and orientation selectivity.
We introduce two new formulations that measure the frequency and orientation
selectivity of the features learned by convolutional layers, and use them to
evaluate deep features learned by widely used deep CNNs such as VGG-16. We
demonstrate that the pre-trained CNN features which receive higher scores are
better at predicting human quality judgments. Furthermore, we show the possibility of
using our method to select deep features to form a new loss function, which
improves the image reconstruction quality for the well-known single-image
super-resolution problem. |
---|---|
DOI: | 10.48550/arxiv.1812.00412 |
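
The summary mentions scoring convolutional features by their orientation selectivity. The record does not reproduce the paper's formulation, so the sketch below is only a generic illustration under stated assumptions: it probes a single filter with sinusoidal gratings and reports one minus the circular variance of its responses, a common tuning index. The function names (`grating`, `gabor_kernel`, `gaussian_kernel`, `response`, `orientation_selectivity`) and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np


def _grid(size):
    """Centered integer pixel coordinates for an odd-sized square kernel."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    return x, y


def grating(size, freq, theta, phase=0.0):
    """Sinusoidal grating; freq in cycles per pixel, theta in radians."""
    x, y = _grid(size)
    return np.cos(2 * np.pi * freq * (x * np.cos(theta) + y * np.sin(theta)) + phase)


def gabor_kernel(size, freq, theta, sigma):
    """Oriented test filter: Gaussian envelope times a cosine carrier."""
    x, y = _grid(size)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * freq * (x * np.cos(theta) + y * np.sin(theta)))


def gaussian_kernel(size, sigma):
    """Isotropic test filter with no preferred orientation."""
    x, y = _grid(size)
    return np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))


def response(kernel, freq, theta):
    """Phase-invariant response: magnitude over a quadrature pair of gratings."""
    size = kernel.shape[0]
    r0 = np.sum(kernel * grating(size, freq, theta, 0.0))
    r1 = np.sum(kernel * grating(size, freq, theta, np.pi / 2))
    return np.hypot(r0, r1)


def orientation_selectivity(kernel, freq, n_orient=16):
    """1 - circular variance over orientations in [0, pi).

    Values near 0 mean the filter responds equally to all orientations;
    values near 1 mean it responds to a narrow band of orientations.
    This index is an assumption of this sketch, not the paper's measure.
    """
    thetas = np.arange(n_orient) * np.pi / n_orient
    r = np.array([response(kernel, freq, t) for t in thetas])
    total = r.sum()
    if total == 0:
        return 0.0
    # Orientation is pi-periodic, so angles are doubled before averaging.
    return float(np.abs(np.sum(r * np.exp(2j * thetas))) / total)


if __name__ == "__main__":
    oriented = gabor_kernel(15, 0.2, 0.0, 3.0)
    isotropic = gaussian_kernel(15, 3.0)
    print("Gabor filter:   ", orientation_selectivity(oriented, 0.2))
    print("Gaussian filter:", orientation_selectivity(isotropic, 0.2))
```

The same index could in principle be applied to each kernel of a pre-trained convolutional layer, but how the paper aggregates per-filter scores or uses them to select features is not stated in this record.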