ACCURACY-PRESERVING DEEP MODEL COMPRESSION

Bibliographic details
Main authors: GARG, Yash; AKYAMAC, Ahmet
Format: Patent
Language: English; French; German
Description
Summary: Techniques described herein provide for compression of machine learning models without significant loss in model accuracy and without requiring model re-training. Compressed machine learning models may then be deployed by resource-constrained devices to improve operational efficiency and throughput. An example method includes providing input data for one or more deep learning tasks to a machine learning model having a plurality of neuronal units. The neuronal units are associated with respective parameters. The method further includes determining respective confidence scores for the plurality of neuronal units responsive to the input data. A confidence score represents the contribution, significance, or impact of a neuronal unit with respect to the overall model output. The method further includes generating a compressed machine learning model based at least in part on removing a subset of neuronal units according to their respective confidence scores and redistributing their parameters to another subset of neuronal units.
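
The three steps in the summary (score the units on input data, remove low-scoring units, redistribute their parameters) can be sketched for a single hidden layer as follows. This is only an illustrative sketch under stated assumptions, not the method claimed in the patent: the record does not say how confidence scores are computed or how parameters are redistributed. Here the score is assumed to be the mean absolute activation times the outgoing weight norm, and redistribution is assumed to be a least-squares fold of each removed unit's outgoing weights into the kept unit whose activations best match it. The function name `compress_hidden_layer` and its parameters are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def compress_hidden_layer(W1, b1, W2, X, n_keep):
    """Prune hidden units of y = relu(X @ W1 + b1) @ W2 without re-training.

    Assumed (not from the source): the scoring rule and the redistribution rule.
    """
    A = relu(X @ W1 + b1)                       # activations, (n_samples, n_hidden)
    # Assumed confidence score: mean |activation| times outgoing weight norm.
    scores = np.abs(A).mean(axis=0) * np.linalg.norm(W2, axis=1)
    order = np.argsort(scores)[::-1]            # highest score first
    keep = np.sort(order[:n_keep])
    drop = np.sort(order[n_keep:])

    W2_new = W2[keep].astype(float)             # astype returns a copy
    A_keep = A[:, keep]
    energy = (A_keep ** 2).sum(axis=0) + 1e-12  # avoid division by zero
    for r in drop:
        proj = A_keep.T @ A[:, r]
        # Kept unit whose activations explain the removed unit best.
        k = int(np.argmax(proj ** 2 / energy))
        # Fold the removed unit's outgoing weights into that kept unit.
        W2_new[k] += (proj[k] / energy[k]) * W2[r]
    return W1[:, keep], b1[keep], W2_new, keep

# Toy example: hidden unit 2 is a half-scaled copy of unit 0.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
W1 = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, 0.0]])
b1 = np.zeros(3)
W2 = np.array([[1.0], [1.0], [0.2]])

W1c, b1c, W2c, kept = compress_hidden_layer(W1, b1, W2, X, n_keep=2)
y_full = relu(X @ W1 + b1) @ W2
y_small = relu(X @ W1c + b1c) @ W2c
```

In this toy case the removed unit is an exact scaled copy of a kept unit, so `y_small` should agree closely with `y_full`; on a real model the fold is only an approximation, and the accuracy impact would have to be measured.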