A Tighter Complexity Analysis of SparseGPT
| Main authors: | , , , |
|---|---|
| Format: | Article |
| Language: | eng |
| Keywords: | |
| Online access: | Order full text |
Summary: In this work, we improve the analysis of the running time of SparseGPT [Frantar, Alistarh ICML 2023] from $O(d^{3})$ to $O(d^{\omega} + d^{2+a+o(1)} + d^{1+\omega(1,1,a)-a})$ for any $a \in [0, 1]$, where $\omega$ is the exponent of matrix multiplication and $\omega(1,1,a)$ is the exponent of multiplying a $d \times d$ matrix by a $d \times d^{a}$ matrix. In particular, for the current bound $\omega \approx 2.371$ [Alman, Duan, Williams, Xu, Xu, Zhou 2024], the running time comes out to $O(d^{2.53})$. The improvement comes from analyzing the lazy update behavior in iterative maintenance problems, as in [Deng, Song, Weinstein 2022; Brand, Song, Zhou ICML 2024].
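To see where the $O(d^{2.53})$ figure comes from, the sketch below balances the last two terms of the stated running time. The balance point $a \approx 0.53$ is an assumed reading of the rectangular matrix-multiplication bounds in [Alman, Duan, Williams, Xu, Xu, Zhou 2024], not a value given in the record itself.

```latex
% Hedged sketch of the exponent balancing behind the O(d^{2.53}) bound.
% Assumption: a ~ 0.53 is read off the rectangular matrix-multiplication
% tables of [Alman, Duan, Williams, Xu, Xu, Zhou 2024].
\documentclass{article}
\usepackage{amsmath}
\begin{document}
The running time is
\[
  O\bigl(d^{\omega} + d^{2+a+o(1)} + d^{1+\omega(1,1,a)-a}\bigr),
  \qquad a \in [0,1].
\]
The middle term increases with $a$ while the last term decreases, so the
optimal choice of $a$ equates them:
\[
  2 + a = 1 + \omega(1,1,a) - a
  \iff
  \omega(1,1,a) = 1 + 2a .
\]
Known rectangular bounds satisfy this near $a \approx 0.53$, giving
\[
  O\bigl(d^{\omega} + d^{2.53}\bigr) = O(d^{2.53}),
  \qquad \text{since } \omega \approx 2.371 < 2.53 .
\]
\end{document}
```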
DOI: 10.48550/arxiv.2408.12151