Mode combinability: Exploring convex combinations of permutation-aligned models

Bibliographic details
Published in: Neural networks 2024-05, Vol.173, p.106204-106204, Article 106204
Main authors: Csiszárik, Adrián; Kiss, Melinda F.; Kőrösi-Szabó, Péter; Muntag, Márton; Papp, Gergely; Varga, Dániel
Format: Article
Language: English
Description
Summary: We explore element-wise convex combinations of two permutation-aligned neural network parameter vectors Θ_A and Θ_B of size d. We conduct extensive experiments by examining various distributions of such model combinations parametrized by elements of the hypercube [0,1]^d and its vicinity. Our findings reveal that broad regions of the hypercube form surfaces of low loss values, indicating that the notion of linear mode connectivity extends to a more general phenomenon which we call mode combinability. We also make several novel observations regarding linear mode connectivity and model re-basin. We demonstrate a transitivity property: two models re-based to a common third model are also linear mode connected, and a robustness property: even with significant perturbations of the neuron matchings, the resulting combinations continue to form a working model. Moreover, we analyze the functional and weight similarity of model combinations and show that such combinations are non-vacuous in the sense that there are significant functional differences between the resulting models.
ISSN: 0893-6080, 1879-2782
DOI: 10.1016/j.neunet.2024.106204
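
The element-wise convex combination described in the summary can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function name `combine`, the toy random vectors, the uniform sampling of the coefficients, and the convention that a coefficient of 1 selects the first model are all assumptions made for illustration. The permutation alignment step is assumed to have been done already.

```python
import numpy as np


def combine(theta_a, theta_b, lam):
    """Element-wise convex combination lam * theta_a + (1 - lam) * theta_b.

    `lam` may be a scalar or an array with one coefficient per parameter.
    Convention (an assumption here): lam = 1 selects theta_a, lam = 0 theta_b.
    """
    theta_a = np.asarray(theta_a, dtype=float)
    theta_b = np.asarray(theta_b, dtype=float)
    if theta_a.shape != theta_b.shape:
        raise ValueError("parameter vectors must have the same shape")
    lam = np.broadcast_to(np.asarray(lam, dtype=float), theta_a.shape)
    return lam * theta_a + (1.0 - lam) * theta_b


# Toy stand-ins for two already-aligned parameter vectors of size d.
rng = np.random.default_rng(0)
d = 5
theta_a = rng.normal(size=d)
theta_b = rng.normal(size=d)

# One point of the hypercube [0,1]^d: an independent coefficient per parameter.
# Uniform sampling is only one possible choice of distribution.
lam = rng.uniform(0.0, 1.0, size=d)
theta_mix = combine(theta_a, theta_b, lam)

# A scalar coefficient gives ordinary linear interpolation between the models,
# the special case studied under linear mode connectivity.
theta_mid = combine(theta_a, theta_b, 0.5)
```

In this sketch every coordinate of `theta_mix` lies between the corresponding coordinates of the two inputs. Evaluating the loss of the combined model would require loading the combined vector back into a network, which is outside the scope of this example.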