Towards Model-Size Agnostic, Compute-Free, Memorization-based Inference of Deep Learning
Format: Article
Language: English
Abstract: The rapid advancement of deep neural networks has significantly improved various tasks, such as image and speech recognition. However, as the complexity of these models increases, so do the computational cost and the number of parameters, making it difficult to deploy them on resource-constrained devices. This paper proposes a novel memorization-based inference (MBI) scheme that is compute-free and requires only lookups. Specifically, our work capitalizes on the inference mechanism of the recurrent attention model (RAM), where only a small window of the input domain (a glimpse) is processed in one time step, and the outputs from multiple glimpses are combined through a hidden vector to determine the overall classification output of the problem. By leveraging the low dimensionality of glimpses, our inference procedure stores key-value pairs comprising glimpse location, patch vector, etc. in a table. Computations are obviated during inference by using the table to read out key-value pairs and performing compute-free inference by memorization. By exploiting Bayesian optimization and clustering, the necessary lookups are reduced and accuracy is improved. We also present in-memory computing circuits to quickly look up the key vector matching an input query. Compared to competitive compute-in-memory (CIM) approaches, MBI improves energy efficiency by almost 2.7x over multilayer perceptron (MLP)-CIM and by almost 83x over ResNet20-CIM for MNIST character recognition.
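The lookup mechanism described in the abstract can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the paper's implementation: the `GlimpseTable` class, its methods, and the brute-force nearest-neighbor search are assumptions for illustration (the paper uses in-memory computing circuits for the key match, and combines glimpses through a hidden vector rather than a plain sum of scores).

```python
import numpy as np

class GlimpseTable:
    """Hypothetical key-value store for memorization-based inference:
    a key encodes a glimpse (location + low-dimensional patch vector);
    the value is that glimpse's per-class score contribution."""

    def __init__(self):
        self.keys = []    # concatenated (location, patch) vectors
        self.values = []  # per-class score contributions

    def store(self, location, patch, class_scores):
        # Populate the table offline; no computation happens at inference.
        self.keys.append(np.concatenate([location, patch]))
        self.values.append(np.asarray(class_scores, dtype=float))

    def lookup(self, location, patch):
        # Nearest-neighbor key match; the paper's hardware would perform
        # this search with in-memory computing circuits instead.
        query = np.concatenate([location, patch])
        dists = [np.linalg.norm(query - k) for k in self.keys]
        return self.values[int(np.argmin(dists))]

def classify(table, glimpses):
    # Accumulate looked-up contributions over all glimpses, then take
    # the argmax as the classification output.
    total = sum(table.lookup(loc, patch) for loc, patch in glimpses)
    return int(np.argmax(total))
```

The sketch conveys the central trade: inference cost becomes a table search per glimpse rather than a forward pass, which is why reducing the number of required lookups (via Bayesian optimization and clustering, per the abstract) matters for both energy and accuracy.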
DOI: 10.48550/arxiv.2307.07631