ERDSE: efficient reinforcement learning based design space exploration method for CNN accelerator on resource limited platform
Published in: | Graphics & visual computing 2021-06, Vol.4, p.200024, Article 200024 |
---|---|
Main authors: | , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | • ERDSE is an RL-based DSE method for the complex design space of CNN accelerators.
• An off-policy strategy is applied to decouple the sampling phase and the learning phase of RL.
• The sampling phase and the learning phase are refined separately to improve efficiency.
• Noise disturbance in the sampling phase promotes exploration ability.
• A maximum-return approximated gradient enhances sample utilization in the learning phase.
Convolutional Neural Network (CNN) accelerator design on resource-limited platforms lacks an efficient design space exploration (DSE) method because the design space is huge and irregular. The numerous parameters of the accelerator architecture and dataflow mode jointly construct a huge design space, while power and resource constraints make that space quite irregular. Under such circumstances, traditional DSE methods based on exhaustive search are infeasible for the non-trivial design space, and methods based on general optimization algorithms are also inefficient because of the irregular distribution of design points. In this paper, we provide an efficient DSE method named ERDSE for CNN accelerator design on resource-limited platforms. ERDSE is based on the reinforcement learning algorithm REINFORCE but refines it to adapt to the complex design space. ERDSE implements an off-policy strategy to decouple the sampling and learning phases, then refines each phase separately to further improve exploration ability and sample utilization. We apply ERDSE to optimize the computing latency of CNN accelerators for VGG-16 and MobileNet-V3. Under the tightest constraints, ERDSE achieves 1.2x-1.7x (on VGG-16) and 2.3x-4.9x (on MobileNet-V3) latency improvement compared with other DSE methods, which demonstrates the efficiency of ERDSE. |
---|---|
ISSN: | 2666-6294 |
DOI: | 10.1016/j.gvc.2021.200024 |
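
The summary describes a REINFORCE-style search with a noise-disturbed sampling phase decoupled from the learning phase. The record gives no implementation details, so the following is only a minimal sketch of that general idea, not the authors' method. The design space, the resource budget, the toy latency model, and every name in the code (`DESIGN_SPACE`, `toy_latency`, `explore`, and so on) are hypothetical. The clipped importance ratio is a standard off-policy correction swapped in for illustration, and the paper's maximum-return approximated gradient is not reproduced here.

```python
import math
import random

# Hypothetical design space: each parameter has a few legal values.
DESIGN_SPACE = {
    "pe_rows": [4, 8, 16, 32],
    "pe_cols": [4, 8, 16, 32],
    "buffer_kb": [32, 64, 128, 256],
}
RESOURCE_BUDGET = 600  # made-up resource limit (arbitrary units)


def is_feasible(point):
    # Toy resource model: PE array area plus buffer size.
    return point["pe_rows"] * point["pe_cols"] + point["buffer_kb"] <= RESOURCE_BUDGET


def toy_latency(point):
    # Toy latency model: more PEs and a larger buffer both reduce latency.
    return 1e4 / (point["pe_rows"] * point["pe_cols"]) + 2e3 / point["buffer_kb"]


def reward(point):
    # Infeasible points get a penalty worse than any feasible latency.
    return -toy_latency(point) / 100.0 if is_feasible(point) else -10.0


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def explore(steps=300, batch=16, lr=0.1, noise_std=0.5, seed=0):
    rng = random.Random(seed)
    # Target policy: one independent categorical distribution per parameter.
    logits = {name: [0.0] * len(vals) for name, vals in DESIGN_SPACE.items()}
    best_point, best_latency = None, math.inf
    baseline = 0.0
    for _ in range(steps):
        target = {name: softmax(l) for name, l in logits.items()}
        samples = []
        # Sampling phase: draw from a noise-perturbed behavior policy.
        for _ in range(batch):
            idx, ratio = {}, 1.0
            for name, vals in DESIGN_SPACE.items():
                noisy = [l + rng.gauss(0.0, noise_std) for l in logits[name]]
                behavior = softmax(noisy)
                i = rng.choices(range(len(vals)), weights=behavior)[0]
                idx[name] = i
                ratio *= target[name][i] / behavior[i]
            point = {name: DESIGN_SPACE[name][i] for name, i in idx.items()}
            if is_feasible(point) and toy_latency(point) < best_latency:
                best_point, best_latency = point, toy_latency(point)
            # Clip the importance ratio to limit gradient variance.
            samples.append((idx, min(ratio, 5.0), reward(point)))
        # Learning phase: REINFORCE update of the target policy.
        baseline = 0.9 * baseline + 0.1 * sum(r for _, _, r in samples) / batch
        for idx, weight, r in samples:
            scale = lr * weight * (r - baseline) / batch
            for name in DESIGN_SPACE:
                for j, p in enumerate(target[name]):
                    # Gradient of log softmax: one-hot(action) minus probabilities.
                    grad = (1.0 if j == idx[name] else 0.0) - p
                    logits[name][j] += scale * grad
    return best_point, best_latency
```

Calling `explore()` returns the best feasible design point seen during the search together with its toy latency. Because the sampling loop and the update loop are separate, either one can be swapped out independently, which is the decoupling the summary refers to.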