SYSTEM AND METHOD FOR PRUNING FILTERS IN DEEP NEURAL NETWORKS

Bibliographic details
Main authors: Phielipp, Mariano J; Miret, Santiago; Chua, Vui Seng; Jain, Nilesh
Format: Patent
Language: English
Description
Summary: An apparatus is provided to compress deep neural networks (DNNs) by pruning filters on a per-group basis. For example, the apparatus accesses a trained DNN that includes a plurality of layers and generates a sequential graph representation of those layers. The sequential graph representation includes a sequence of nodes, each of which is a graph representation of a layer. The apparatus clusters the layers into layer groups, each including one or more layers. The apparatus determines a pruning ratio for a layer group and prunes the filters of the layers in that group based on the pruning ratio. The apparatus may use a graph neural network (GNN) to cluster the layers and determine the pruning ratio. Through the filter pruning process, the apparatus generates compressed layers from the layers in the layer group, and it further updates the DNN by replacing the layers in the layer group with the compressed layers.
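
The per-group pruning flow described in the summary can be illustrated with a minimal Python sketch. It is not the patented method. The record does not say how filters are ranked, so the sketch assumes an L1-norm criterion. The layer groups and pruning ratios are supplied by hand here, whereas the summary says a GNN may produce them. The names `Layer`, `prune_layer`, and `prune_by_group` are hypothetical, and a real network would also need the input channels of each following layer adjusted, which is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List

Filter = List[float]  # one filter, flattened to a list of weights


@dataclass
class Layer:
    name: str
    filters: List[Filter]


def l1_norm(f: Filter) -> float:
    # Assumed importance score; the source does not name a ranking criterion.
    return sum(abs(w) for w in f)


def prune_layer(layer: Layer, ratio: float) -> Layer:
    """Return a compressed copy of `layer` without its lowest-scoring filters."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    n_keep = len(layer.filters) - int(len(layer.filters) * ratio)
    # Rank filter indices by score, keep the best ones, restore original order.
    ranked = sorted(range(len(layer.filters)),
                    key=lambda i: l1_norm(layer.filters[i]), reverse=True)
    keep = sorted(ranked[:n_keep])
    return Layer(layer.name, [layer.filters[i] for i in keep])


def prune_by_group(layers: List[Layer],
                   groups: Dict[int, List[str]],
                   ratios: Dict[int, float]) -> List[Layer]:
    """Apply one pruning ratio per layer group and return the updated layer sequence."""
    ratio_of = {name: ratios[gid] for gid, names in groups.items() for name in names}
    # Layers that belong to no group are passed through unchanged.
    return [prune_layer(l, ratio_of[l.name]) if l.name in ratio_of else l
            for l in layers]


# The ordered list stands in for the sequential graph: one node per layer.
layers = [
    Layer("conv1", [[0.1, -0.1], [2.0, 1.0], [0.5, 0.5], [0.0, 0.05]]),
    Layer("conv2", [[1.0], [0.2], [3.0], [0.1]]),
    Layer("conv3", [[0.3], [0.4]]),
]
groups = {0: ["conv1", "conv2"], 1: ["conv3"]}  # hand-written stand-in for GNN clustering
ratios = {0: 0.5, 1: 0.0}                       # hand-written stand-in for GNN-chosen ratios

pruned = prune_by_group(layers, groups, ratios)
for layer in pruned:
    print(layer.name, len(layer.filters))
```

In this toy run, the two layers in group 0 each lose half of their filters, while the layer in group 1 is left intact because its ratio is 0.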