LCCStyle: Arbitrary Style Transfer With Low Computational Complexity

Bibliographic details
Published in: IEEE Transactions on Multimedia, 2023, Vol. 25, pp. 501-514
Main authors: Huang, Yujie; Jing, Minge; Zhou, Jinjia; Liu, Yuhao; Fan, Yibo
Format: Article
Language: English
Description
Summary: Surprising performance has been achieved in style transfer since deep learning was introduced to it. However, the existing state-of-the-art (SOTA) algorithms suffer from either quality issues or high computational complexity. The quality issues concern shape retention and the adequacy of style migration; the computational complexity shows up in the network complexity and in the additional updates required when the style changes. To deal with these problems, we propose a novel low computational complexity arbitrary style transfer algorithm (LCCStyle) that mainly consists of a transformation feature module (TFM) and a learning transformation module (LTM). The TFM transforms the content feature map into the stylized feature map without affecting the integrity of the content information, which contributes to good shape retention and full style migration. In addition, to avoid additional updates when the style changes, we propose a new training mechanism for arbitrary style transfer in which a hyper-network directly generates the parameters of the TFM. However, the widely used hyper-networks are composed of fully connected layers, which results in a large number of parameters. Hence, we designed a hyper-network (the LTM) built from one-dimensional convolutions to suit the characteristics of the Gram matrix of the style feature map, which keeps the model small without affecting quality. Quantitative comparison and a user study show that LCCStyle achieves high performance on both the adequacy of style migration and shape retention. Furthermore, compared with the SOTAs, the size of the proposed model is reduced by 51.4% to 99.6%. When the input is 512×512 pixels, the processing speeds with an unchanged style and with a constantly changing style are increased by at least 135% and 227%, respectively. On an Nvidia TITAN RTX GPU, LCCStyle reaches 60 fps for 720p video and takes only 1 s to process 8K images.
https://github.com/HuangYujie94/LCCStyle
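
The summary refers to the Gram matrix of a style feature map and to the parameter cost of fully connected hyper-networks compared with one-dimensional convolutions. The following is a minimal sketch of both ideas, not the authors' implementation: the function names, the normalization by the number of spatial positions, and the layer sizes (a 64×64 Gram matrix, kernel size 3) are assumptions chosen only for illustration.

```python
def gram_matrix(feature_map):
    """Gram matrix of a feature map given as C channels, each a flat list
    of H*W activations. Dividing by H*W is an assumed normalization."""
    n = len(feature_map[0])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in feature_map]
        for ci in feature_map
    ]


def fc_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def conv1d_params(c_in, c_out, kernel):
    """Weights plus biases of one 1-D convolution layer."""
    return c_in * c_out * kernel + c_out


# Toy feature map: 2 channels, 4 spatial positions each.
toy_gram = gram_matrix([[1, 2, 3, 4], [0, 1, 0, 1]])

# Hypothetical sizes: a flattened 64x64 Gram matrix fed to a fully
# connected layer, versus its 64 rows treated as 64 input channels of
# a 1-D convolution with kernel size 3.
fc_count = fc_params(64 * 64, 64 * 64)
conv_count = conv1d_params(64, 64, 3)
print(toy_gram, fc_count, conv_count)
```

Under these assumed sizes the fully connected layer needs 16,781,312 parameters and the 1-D convolution 12,352, which illustrates why replacing fully connected layers can shrink a hyper-network considerably. The layer shapes actually used in LCCStyle are not stated in this record.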
ISSN: 1520-9210, 1941-0077
DOI: 10.1109/TMM.2021.3128058