TORE: Token Recycling in Vision Transformers for Efficient Active Visual Exploration
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Active Visual Exploration (AVE) optimizes the utilization of robotic
resources in real-world scenarios by sequentially selecting the most
informative observations. However, modern methods require a high computational
budget due to processing the same observations multiple times through the
autoencoder transformers.
As a remedy, we introduce a novel approach to AVE called TOken REcycling
(TORE). It divides the encoder into extractor and aggregator components. The
extractor processes each observation separately, enabling the reuse of tokens
passed to the aggregator. To reduce computation further, we also shrink the
decoder to a single block.
Through extensive experiments, we demonstrate that TORE outperforms
state-of-the-art methods while reducing computational overhead by up to 90%. |
---|---|
DOI: | 10.48550/arxiv.2311.15335 |
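
The summary describes splitting the encoder into an extractor that runs on each observation separately and an aggregator that consumes the resulting tokens, so tokens from earlier observations can be reused. The following is a minimal sketch of that caching idea only, not the authors' implementation: the class `TokenRecycler`, the glimpse identifiers, and the toy extractor and aggregator functions are all hypothetical stand-ins for the transformer blocks.

```python
from typing import Callable, Dict, List, Tuple

Glimpse = Tuple[int, int]   # hypothetical glimpse id: (row, col) of an image patch
Tokens = List[float]        # stand-in for the token vectors a real extractor emits


class TokenRecycler:
    """Caches extractor outputs so each glimpse is encoded only once."""

    def __init__(
        self,
        extractor: Callable[[Glimpse], Tokens],
        aggregator: Callable[[List[Tokens]], Tokens],
    ) -> None:
        self.extractor = extractor
        self.aggregator = aggregator
        self.cache: Dict[Glimpse, Tokens] = {}
        self.extractor_calls = 0

    def observe(self, glimpse: Glimpse) -> Tokens:
        # Run the extractor only for glimpses that have not been seen yet.
        if glimpse not in self.cache:
            self.cache[glimpse] = self.extractor(glimpse)
            self.extractor_calls += 1
        # The aggregator combines the cached tokens of all glimpses so far.
        return self.aggregator(list(self.cache.values()))


def toy_extractor(glimpse: Glimpse) -> Tokens:
    # Assumption: a trivial "encoding" that just returns the coordinates.
    row, col = glimpse
    return [float(row), float(col)]


def toy_aggregator(token_sets: List[Tokens]) -> Tokens:
    # Assumption: mean pooling in place of the aggregator transformer blocks.
    count = len(token_sets)
    return [sum(tokens[i] for tokens in token_sets) / count for i in range(2)]


glimpses = [(0, 0), (0, 2), (2, 2)]
recycler = TokenRecycler(toy_extractor, toy_aggregator)
for glimpse in glimpses:
    state = recycler.observe(glimpse)

# Without the cache, step k would re-encode all k glimpses seen so far.
naive_calls = sum(range(1, len(glimpses) + 1))
print(recycler.extractor_calls, naive_calls)
```

In this toy setting the cached variant runs the extractor once per glimpse, while re-encoding everything at every step grows quadratically with the number of glimpses. The savings reported in the summary depend on the actual model and are not reproduced by this sketch.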