VIPeR: Visual Incremental Place Recognition with Adaptive Mining and Lifelong Learning
Format: Article
Language: English
Online access: Order full text
Abstract: Visual place recognition (VPR) is an essential component of many autonomous and augmented/virtual reality systems, enabling them to robustly localize themselves in large-scale environments. Existing VPR methods demonstrate attractive performance at the cost of heavy pre-training and limited generalizability: when deployed in unseen environments, they exhibit significant performance drops. Targeting this issue, we present VIPeR, a novel approach for visual incremental place recognition with the ability to adapt to new environments while retaining performance on previously visited ones. We first introduce an adaptive mining strategy that balances performance within a single environment against generalizability across multiple environments. Then, to prevent catastrophic forgetting during lifelong learning, we draw inspiration from human memory systems and design a novel memory bank for VIPeR. The memory bank contains a sensory memory, a working memory, and a long-term memory, with the first two focusing on the current environment and the last covering all previously visited environments. Additionally, we propose a probabilistic knowledge distillation to explicitly safeguard previously learned knowledge. We evaluate VIPeR on three large-scale datasets: Oxford RobotCar, Nordland, and TartanAir. For comparison, we first set a baseline with naive finetuning and then compare against several recent lifelong learning methods. VIPeR achieves better performance in almost all aspects, with the largest improvement being 13.65% in average performance.
DOI: 10.48550/arxiv.2407.21416
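The abstract above only names the main ingredients (a three-tier memory bank and a probabilistic knowledge distillation) without implementation details. As a rough illustration only, the sketch below shows one generic way a sensory/working/long-term memory bank and a similarity-distillation loss could be wired up in PyTorch. The class `ThreeTierMemoryBank`, the function `probabilistic_distillation_loss`, and all capacities and temperatures are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch, NOT the VIPeR implementation: a toy three-tier memory
# bank and a generic KL-based distillation term over descriptor similarities.
from collections import deque

import torch
import torch.nn.functional as F


class ThreeTierMemoryBank:
    """Toy store: recent descriptors enter sensory memory, spill into working
    memory, and a random subset per environment is promoted to long-term
    memory. All capacities are arbitrary assumptions."""

    def __init__(self, sensory_size=32, working_size=512):
        self.sensory = deque(maxlen=sensory_size)   # most recent frames
        self.working = deque(maxlen=working_size)   # current environment
        self.long_term = {}                         # env_id -> list of descriptors

    def observe(self, descriptor):
        """Add one descriptor; the oldest sensory item spills into working memory."""
        if len(self.sensory) == self.sensory.maxlen:
            self.working.append(self.sensory[0])
        self.sensory.append(descriptor.detach())

    def consolidate(self, env_id, keep=128):
        """On leaving an environment, keep a random subset in long-term memory."""
        pool = list(self.sensory) + list(self.working)
        idx = torch.randperm(len(pool))[:keep]
        self.long_term[env_id] = [pool[int(i)] for i in idx]
        self.sensory.clear()
        self.working.clear()


def probabilistic_distillation_loss(old_desc, new_desc, temperature=0.1):
    """KL divergence between the pairwise-similarity distributions of a frozen
    old model and the current model on the same mini-batch; a generic stand-in
    for distilling previously learned knowledge."""
    old_sim = old_desc @ old_desc.t() / temperature   # (B, B) similarities
    new_sim = new_desc @ new_desc.t() / temperature
    return F.kl_div(
        F.log_softmax(new_sim, dim=1),
        F.softmax(old_sim, dim=1),
        reduction="batchmean",
    )
```

With L2-normalized global descriptors, such a distillation term could simply be added to the place-recognition loss when training on a new environment, penalizing drift of the similarity structure learned in earlier environments.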