Distributed Mean Estimation with Limited Communication
Abstract: Motivated by the need for distributed learning and optimization algorithms with low communication cost, we study communication-efficient algorithms for distributed mean estimation. Unlike previous works, we make no probabilistic assumptions on the data. We first show that for $d$-dimensional data with $n$ clients, a naive stochastic binary rounding approach yields a mean squared error (MSE) of $\Theta(d/n)$ and uses a constant number of bits per dimension per client. We then extend this naive algorithm in two ways: we show that applying a structured random rotation before quantization reduces the error to $\mathcal{O}((\log d)/n)$, and that a better coding strategy further reduces the error to $\mathcal{O}(1/n)$ while still using a constant number of bits per dimension per client. We also show that the latter coding strategy is optimal up to a constant in the minimax sense, i.e., it achieves the best MSE for a given communication cost. We finally demonstrate the practicality of our algorithms by applying them to distributed Lloyd's algorithm for $k$-means and power iteration for PCA.
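The structured random rotation can be sketched in the same spirit. A standard instantiation of such a rotation is the randomized Hadamard transform $HD$, with $H$ a Walsh-Hadamard matrix and $D$ a diagonal matrix of random signs; the sketch below is ours, with assumed names (`fwht`, `rotate`, `unrotate`), the assumption that $d$ is a power of two, and a sign pattern shared between clients and server via a common seed. Rotating before quantization spreads each vector's energy evenly across coordinates, shrinking the dynamic range the binary quantizer pays for, which is the mechanism behind the improved $\mathcal{O}((\log d)/n)$ error.

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform; len(x) must be a power of two."""
    x = x.copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            a = x[i:i + h].copy()
            x[i:i + h] = a + x[i + h:i + 2 * h]
            x[i + h:i + 2 * h] = a - x[i + h:i + 2 * h]
        h *= 2
    return x

def rotate(x, signs):
    """y = (1 / sqrt(d)) * H * D * x, with D = diag(signs)."""
    return fwht(signs * x) / np.sqrt(len(x))

def unrotate(y, signs):
    """Inverse rotation: x = D * H * y / sqrt(d), using H symmetric and H H = d I."""
    return signs * fwht(y) / np.sqrt(len(y))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    d = 1024                                 # power of two for the Hadamard transform
    signs = rng.choice([-1.0, 1.0], size=d)  # shared across clients via a common seed
    x = rng.standard_normal(d)
    assert np.allclose(unrotate(rotate(x, signs), signs), x)
```

In the full pipeline, each client would send the quantized rotated vector and the server would unrotate the average; because the rotation is orthonormal and shared, both the unbiasedness of the estimate and the per-client communication cost are unchanged.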
DOI: 10.48550/arxiv.1611.00429