Image Aesthetics Assessment with Attribute-Assisted Multimodal Memory Network
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2023-12, Vol. 33 (12), p. 1-1
Main authors: , , , , ,
Format: Article
Language: English
Online access: Order full text
Abstract: Image aesthetics assessment (IAA) has attracted growing interest in recent years but remains challenging due to its highly abstract nature. Nowadays, more and more people comment on images shared on social networks, and these comments provide rich aesthetics-aware semantic information from different aspects. User comments on an image can therefore be exploited as supplementary information for enhancing aesthetic representation learning. Previous research has demonstrated that aesthetic attributes have a significant effect on image aesthetic quality and on human aesthetic perception. Typically, people comment on an image from the perspective of its aesthetic attributes, from which the aesthetic quality of the image can be inferred. Motivated by this, this paper presents an Attribute-assisted Multimodal Memory Network (AMM-Net) for image aesthetics assessment, which utilizes aesthetic attributes to model the interactions between the visual and textual modalities. Specifically, we design two memory networks to capture the attribute-aware information most relevant to the image and to its associated comments, respectively. Further, with multiple memory hops, the attribute semantics shared by the two modalities are refined and the cross-modal interactions are progressively enhanced. Finally, more discriminative aesthetic representations are obtained for IAA. Experimental results and comparisons on two public multimodal IAA datasets demonstrate the superiority of the proposed model over state-of-the-art methods. The source code is available at https://github.com/zhutong0219/AMM-Net.
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2023.3272984
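
The authors' released implementation is available at the repository linked in the abstract. To make the described mechanism concrete, the sketch below illustrates, under stated assumptions, one way a multi-hop, attribute-assisted memory module over visual and textual features could be organized: learned attribute embeddings query each modality with scaled dot-product attention, and repeated hops progressively refine a shared attribute representation before a score head regresses aesthetic quality. All class names (`AttributeMemoryHop`, `MultimodalMemorySketch`), dimensions, and design choices here are illustrative assumptions, not the authors' AMM-Net.

```python
import torch
import torch.nn as nn


class AttributeMemoryHop(nn.Module):
    """One hypothetical memory hop: attribute queries attend over one modality's features."""

    def __init__(self, dim: int):
        super().__init__()
        self.query_proj = nn.Linear(dim, dim)
        self.key_proj = nn.Linear(dim, dim)
        self.value_proj = nn.Linear(dim, dim)

    def forward(self, attr_query: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # attr_query: (B, A, D) attribute embeddings; memory: (B, N, D) modality features.
        q, k, v = self.query_proj(attr_query), self.key_proj(memory), self.value_proj(memory)
        attn = torch.softmax(q @ k.transpose(1, 2) / k.size(-1) ** 0.5, dim=-1)
        # Residually refine the attribute representation with attended modality content.
        return attr_query + attn @ v


class MultimodalMemorySketch(nn.Module):
    """Illustrative multi-hop fusion of visual and textual features via shared attribute queries."""

    def __init__(self, dim: int = 256, num_attrs: int = 8, hops: int = 3):
        super().__init__()
        self.attr_embed = nn.Parameter(torch.randn(num_attrs, dim))  # learned attribute slots (assumed)
        self.visual_hops = nn.ModuleList(AttributeMemoryHop(dim) for _ in range(hops))
        self.text_hops = nn.ModuleList(AttributeMemoryHop(dim) for _ in range(hops))
        self.score_head = nn.Linear(dim, 1)  # regress a single aesthetic score

    def forward(self, visual_feats: torch.Tensor, text_feats: torch.Tensor) -> torch.Tensor:
        # visual_feats: (B, Nv, D) image patch features; text_feats: (B, Nt, D) comment token features.
        attrs = self.attr_embed.unsqueeze(0).expand(visual_feats.size(0), -1, -1)
        for v_hop, t_hop in zip(self.visual_hops, self.text_hops):
            # Each hop lets the shared attribute queries gather evidence from both modalities.
            attrs = v_hop(attrs, visual_feats)
            attrs = t_hop(attrs, text_feats)
        return self.score_head(attrs.mean(dim=1)).squeeze(-1)


# Example usage with random features: batch of 4, 49 visual patches, 32 comment tokens.
model = MultimodalMemorySketch(dim=256, num_attrs=8, hops=3)
scores = model(torch.randn(4, 49, 256), torch.randn(4, 32, 256))
print(scores.shape)  # torch.Size([4])
```

In this reading of the abstract, sharing a single set of attribute queries across both modalities is what lets each hop exchange information between the image and the comments; the actual AMM-Net may arrange its memories and hops differently, so consult the linked source code for the definitive architecture.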