A sparse-response deep belief network based on rate distortion theory

Bibliographic Details
Published in: Pattern Recognition, 2014-09, Vol. 47 (9), pp. 3179-3191
Main Authors: Ji, Nan-Nan; Zhang, Jiang-She; Zhang, Chun-Xia
Format: Article
Language: English
Online Access: Full text
Abstract: Deep belief networks (DBNs) are currently the dominant technique for modeling the architectural depth of the brain, and can be trained efficiently in a greedy, layer-wise, unsupervised manner. However, DBNs without a narrow hidden bottleneck typically produce redundant, continuous-valued codes and unstructured weight patterns. Taking inspiration from rate distortion (RD) theory, which encodes the original data using as few bits as possible, this paper introduces a variant of the DBN, referred to as the sparse-response DBN (SR-DBN). In this approach, the Kullback–Leibler divergence between the distribution of the data and the equilibrium distribution defined by the building block of the DBN serves as the distortion function, and a sparse-response regularization induced by the L1-norm of the codes is used to achieve a small code rate. Experiments extracting features from image datasets of different scales show that SR-DBN learns codes with a small rate, extracts features at multiple levels of abstraction mimicking computations in the cortical hierarchy, and obtains more discriminative representations than PCA and several basic DBN algorithms.

Highlights:
• A novel deep belief network based on rate distortion theory is proposed for feature extraction.
• Sparse-response regularization induced by the L1-norm of the codes is used to achieve a small rate.
• The KL divergence serves as the distortion function.
• Hierarchical representations mimicking computations in the cortical hierarchy are learnt.
• More discriminative representations are obtained than with other DBN algorithms.
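The abstract describes the approach only at a high level: an RBM building block trained to shrink the KL divergence between the data and model distributions, with an L1 penalty on the hidden responses keeping the code rate small. The following is a minimal, hypothetical sketch inferred from that description, not the authors' implementation; the class name SparseResponseRBM, the CD-1 training rule, and the hyperparameters lr and lam are all assumptions made for illustration.

```python
# Sketch of the building block suggested by the abstract: a restricted
# Boltzmann machine trained with one-step contrastive divergence (CD-1),
# which approximately minimizes the KL divergence between the data
# distribution and the model's equilibrium distribution, plus an L1
# penalty on the hidden responses (the "codes") to keep the rate small.
# Names and hyperparameter values are assumptions, not the paper's settings.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SparseResponseRBM:
    def __init__(self, n_visible, n_hidden, lr=0.01, lam=0.001, seed=0):
        rng = np.random.default_rng(seed)
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.lr, self.lam = lr, lam

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, rng):
        h0 = self.hidden_probs(v0)                       # data-driven responses
        h0s = (rng.random(h0.shape) < h0).astype(float)  # sampled hidden states
        v1 = self.visible_probs(h0s)                     # one-step reconstruction
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        # CD-1 gradient: the "distortion" term of the objective.
        dW = (v0.T @ h0 - v1.T @ h1) / n
        db = (v0 - v1).mean(axis=0)
        dc = (h0 - h1).mean(axis=0)
        # Subgradient of the L1 "rate" penalty lam * ||h0||_1. The hidden
        # probabilities are non-negative, so |h| = h, and the chain rule
        # brings in the sigmoid derivative h0 * (1 - h0).
        g = h0 * (1.0 - h0)
        dW -= self.lam * (v0.T @ g) / n
        dc -= self.lam * g.mean(axis=0)
        self.W += self.lr * dW
        self.b += self.lr * db
        self.c += self.lr * dc
```

A DBN would then be assembled in the usual greedy, layer-wise fashion: each RBM's hidden responses become the training data for the next layer, roughly as follows (toy data, not the paper's experiments):

```python
rng = np.random.default_rng(0)
X = rng.random((256, 784))           # toy data scaled to [0, 1]
rbm1 = SparseResponseRBM(784, 256)
for _ in range(10):                  # a few CD-1 updates
    rbm1.cd1_step(X, rng)
codes = rbm1.hidden_probs(X)         # sparse codes feed the next layer
```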
ISSN: 0031-3203
eISSN: 1873-5142
DOI: 10.1016/j.patcog.2014.03.025