Stochastic Rounding Implicitly Regularizes Tall-and-Thin Matrices

Bibliographic details
Published in:	arXiv.org 2024-12
Main authors:	Dexter, Gregory; Boutsikas, Christos; Ma, Linkai; Ipsen, Ilse C F; Drineas, Petros
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Machine learning; Matrix theory; Rounding
Online access:	Full text

Description
Summary:	Motivated by the popularity of stochastic rounding in the context of machine learning and the training of large-scale deep neural network models, we consider stochastic nearness rounding of real matrices \(\mathbf{A}\) with many more rows than columns. We provide novel theoretical evidence, supported by extensive experimental evaluation, that with high probability the smallest singular value of a stochastically rounded matrix is well bounded away from zero, regardless of how close \(\mathbf{A}\) is to being rank deficient and even if \(\mathbf{A}\) is rank deficient. In other words, stochastic rounding implicitly regularizes tall-and-thin matrices \(\mathbf{A}\) so that the rounded version has full column rank. Our proofs leverage powerful results in random matrix theory and the idea that stochastic rounding errors do not concentrate in low-dimensional column spaces.
ISSN:	2331-8422
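
Illustration:	The effect described in the summary can be sketched in a few lines of Python. This is a toy illustration, not the authors' code or experimental setup. It assumes rounding to the integer grid (rather than to a floating-point format), a hypothetical 2000-by-2 matrix with two identical columns, and helper names (`stochastic_round`, `smallest_singular_value_two_columns`) invented for this example. Each entry is rounded up with probability equal to its fractional part, independently of the other entries, and the smallest singular value is read off the 2-by-2 Gram matrix.

```python
import math
import random


def stochastic_round(x, rng):
    """Round x to a neighbouring integer: up with probability equal to the
    fractional part of x, down otherwise (so the result is unbiased)."""
    lower = math.floor(x)
    frac = x - lower
    return lower + (1 if rng.random() < frac else 0)


def smallest_singular_value_two_columns(rows):
    """Smallest singular value of an n-by-2 matrix, computed as the square
    root of the smaller eigenvalue of its 2-by-2 Gram matrix [[a, b], [b, c]]."""
    a = sum(r[0] * r[0] for r in rows)
    b = sum(r[0] * r[1] for r in rows)
    c = sum(r[1] * r[1] for r in rows)
    lam_min = 0.5 * ((a + c) - math.sqrt((a - c) ** 2 + 4.0 * b * b))
    return math.sqrt(max(lam_min, 0.0))  # clamp tiny negative round-off


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
n = 2000                # assumed size: many more rows than columns

# Rank-deficient tall matrix: both columns are the same vector.
column = [rng.uniform(0.0, 1.0) for _ in range(n)]
A = [[x, x] for x in column]

# Round every entry independently.
A_rounded = [[stochastic_round(x, rng) for x in row] for row in A]

sigma_original = smallest_singular_value_two_columns(A)
sigma_rounded = smallest_singular_value_two_columns(A_rounded)

print("smallest singular value before rounding:", sigma_original)
print("smallest singular value after rounding: ", sigma_rounded)
```

Because the two columns are rounded independently, the rounded columns generally differ, so in this sketch the rounded matrix has full column rank even though the original has rank one.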