Generalizing Point Embeddings using the Wasserstein Space of Elliptical Distributions
Advances in Neural Information Processing Systems 31, pages 10258--10269, 2018
Format: Article
Language: English
Summary: Advances in Neural Information Processing Systems 31, pages
10258--10269, 2018 Embedding complex objects as vectors in low dimensional spaces is a
longstanding problem in machine learning. We propose in this work an extension
of that approach, which consists in embedding objects as elliptical probability
distributions, namely distributions whose densities have elliptical level sets.
We endow these measures with the 2-Wasserstein metric, with two important
benefits: (i) For such measures, the squared 2-Wasserstein metric has a closed
form, equal to a weighted sum of the squared Euclidean distance between means
and the squared Bures metric between covariance matrices. The latter is a
Riemannian metric between positive semi-definite matrices that turns out to
be Euclidean on a suitable factor representation of such matrices, one that
is valid on the entire geodesic between them. (ii) The 2-Wasserstein
distance boils down to the usual Euclidean metric when comparing Diracs, and
therefore provides a natural framework to extend point embeddings. We show that
for these reasons Wasserstein elliptical embeddings are more intuitive and
yield tools that are better behaved numerically than the alternative choice of
Gaussian embeddings with the Kullback-Leibler divergence. In particular, and
unlike previous work based on the KL geometry, we learn elliptical
distributions that are not necessarily diagonal. We demonstrate the advantages
of elliptical embeddings by using them for visualization, to compute embeddings
of words, and to reflect entailment or hypernymy.
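The closed form mentioned in benefit (i) can be made concrete. As a minimal NumPy sketch (not the authors' code; the function names here are illustrative), the squared 2-Wasserstein distance between two elliptical distributions is the squared Euclidean distance between their means plus the squared Bures metric between their covariance matrices:

```python
import numpy as np

def psd_sqrt(M):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.T

def bures_sq(A, B):
    """Squared Bures metric: tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2})."""
    rA = psd_sqrt(A)
    return np.trace(A) + np.trace(B) - 2.0 * np.trace(psd_sqrt(rA @ B @ rA))

def w2_sq(mean_a, cov_a, mean_b, cov_b):
    """Closed-form squared 2-Wasserstein distance between two elliptical
    distributions: squared distance between means + squared Bures metric
    between covariance matrices."""
    return np.sum((mean_a - mean_b) ** 2) + bures_sq(cov_a, cov_b)

# Benefit (ii): comparing two Diracs (zero covariances) recovers the
# usual squared Euclidean distance between the points.
m1, m2 = np.array([0.0, 0.0]), np.array([3.0, 4.0])
Z = np.zeros((2, 2))
print(w2_sq(m1, Z, m2, Z))  # → 25.0, the squared Euclidean distance
```

This is why the construction extends point embeddings: shrinking the covariances to zero degrades gracefully to ordinary vector embeddings under the Euclidean metric.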
DOI: 10.48550/arxiv.1805.07594