Stacked Autoencoder Based Weak Supervision for Social Image Understanding

Many studies in recent years have focused on social image understanding due to the increasing number of shared images from social networks and online communities. However, previous work in social image understanding fails to learn an effective feature representation because of a large amount of miss...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE access 2019, Vol.7, p.21777-21786
Hauptverfasser: Xu, Chaoyang, Dai, Yuanfei, Lin, Renjie, Wang, Shiping
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Many studies in recent years have focused on social image understanding due to the increasing number of shared images from social networks and online communities. However, previous work in social image understanding fails to learn an effective feature representation because of a large amount of missing and irrelevant tags, though matrix completion techniques are frequently utilized for this purpose. Autoencoder models have been validated to be effective in learning latent low-dimensional representations in unsupervised learning. In this paper, we propose a new social image understanding model based on deep autoencoders, which can learn the shared latent codes of social images and tags as supervision information in the deep autoencoders. First, social images are extracted with multi-modal features, which provide a comprehensive characterization to image semantic understanding. And, the social image understanding problems are transformed into the problem of minimizing an optimization objective. Second, multi-layered autoencoders with weak supervision integration are employed to learn an efficient low-dimensional representation from the multi-view feature sources that can make up the semantic gap between image features and tags through minimizing the problem formulation. Finally, we design a new balanced loss function based on binary cross entropy, in which we address highly sparse inputs for a better optimization performance. The extensive experiments on several real-world social image datasets confirm the effectiveness and robustness of the proposed model compared with the state-of-the-art methods.
ISSN:2169-3536
2169-3536
DOI:10.1109/ACCESS.2019.2898991