Enhancing User Experience in On-Device Machine Learning with Gated Compression Layers
Format: Article
Language: English
Online access: Order full text
Abstract: On-device machine learning (ODML) enables powerful edge applications, but power consumption remains a key challenge for resource-constrained devices. To address this, developers often face a trade-off between model accuracy and power consumption, employing either computationally intensive models on high-power cores or pared-down models on low-power cores. Both approaches typically lead to a compromise in user experience (UX). This work focuses on the use of Gated Compression (GC) layers to enhance ODML model performance while conserving power and maximizing cost-efficiency, especially for always-on use cases. GC layers dynamically regulate data flow by selectively gating neuron activations within the neural network and effectively filtering out non-essential inputs, which reduces power needs without compromising accuracy and enables more efficient execution on heterogeneous compute cores. These improvements enhance UX through prolonged battery life, improved device responsiveness, and greater user comfort. In this work, we have integrated GC layers into vision- and speech-domain models, including the transformer-based ViT model. Our experiments demonstrate theoretical power efficiency gains ranging from 158x to 30,000x for always-on scenarios. This substantial improvement empowers ODML applications with enhanced UX benefits.
DOI: 10.48550/arxiv.2405.01739
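The abstract describes GC layers as selectively gating neuron activations so that non-essential inputs are filtered out before reaching more expensive compute. As a rough illustration only, and not the authors' implementation, the following PyTorch sketch shows one way such a gate could look; the per-feature gate parameterization, the relevance head, and the 0.5 threshold are assumptions introduced here for clarity.

```python
import torch
import torch.nn as nn


class GatedCompressionLayer(nn.Module):
    """Illustrative gate placed between an early (low-power) feature
    extractor and the rest of the network: it sparsifies activations and
    produces a relevance score used to decide whether to stop early."""

    def __init__(self, num_features: int, threshold: float = 0.5):
        super().__init__()
        # Learnable per-feature gate logits (hypothetical parameterization).
        self.gate_logits = nn.Parameter(torch.zeros(num_features))
        # Small head scoring whether an input is "essential" enough to forward.
        self.relevance_head = nn.Linear(num_features, 1)
        self.threshold = threshold

    def forward(self, x: torch.Tensor):
        # Soft gates in (0, 1); multiplying by them suppresses
        # low-importance features, shrinking the data passed downstream.
        gates = torch.sigmoid(self.gate_logits)
        gated = x * gates

        # Per-example relevance score; inputs below the threshold would
        # never reach the expensive downstream model.
        relevance = torch.sigmoid(self.relevance_head(gated)).squeeze(-1)
        forward_mask = relevance >= self.threshold
        return gated, relevance, forward_mask


if __name__ == "__main__":
    layer = GatedCompressionLayer(num_features=64)
    batch = torch.randn(8, 64)
    gated, relevance, forward_mask = layer(batch)
    # Only examples flagged as essential would wake the high-power core.
    essential = gated[forward_mask]
    print(f"{forward_mask.sum().item()} of {batch.shape[0]} inputs forwarded")
```

In a deployment matching the abstract's always-on scenario, the gate and relevance head would run continuously on a low-power core, and only inputs flagged as essential would wake the high-power core running the full model; how the gate is trained and where the threshold is set are details covered in the paper, not in this sketch.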