UFVL-Net: A Unified Framework for Visual Localization Across Multiple Indoor Scenes
Published in: | IEEE Transactions on Instrumentation and Measurement 2023, Vol.72, p.1-16 |
---|---|
Main authors: | , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Scene coordinate regression (SCoRe) approaches for visual localization have recently been extensively investigated. However, current SCoRe methods are scene-specific and require retraining to generalize to new scenes, so model capacity rises steadily as the number of scenes increases. To this end, we develop UFVL-Net, a unifying framework that integrates the localization tasks of multiple indoor scenes into a single manageable network and optimizes these tasks jointly across diverse scene domains, where the localization of each scene domain is treated as a separate task. UFVL-Net is storage-efficient because multiple models with shared parameters can be consolidated into one. Specifically, we introduce two parameter sharing policies, a channel-wise sharing policy (CSP) and a kernel-wise sharing policy, which offer fine-grained parameter sharing within each layer of the backbone for efficient storage while providing task-specific parameters to tackle the inherent difficulty of multidomain learning for visual localization, namely gradient conflict caused by skewed competition among tasks for the shared parameters. The key insight is that task-sharing parameters can learn a generic feature representation across scenes, while task-specific parameters can learn task-related features that alleviate gradient conflict. Moreover, we develop a sign-based gradient normalization (SIGGrad) technique, applied to the task-sharing parameters, that further mitigates gradient conflict during training, thereby emphasizing the use of task-sharing parameters and ensuring that each task is thoroughly optimized. Extensive experiments across numerous datasets and complex real-world scenes show that the UFVL-Net family significantly outperforms state-of-the-art methods while requiring much less storage space. We also demonstrate that UFVL-Net can be generalized to new scenes using only a few task-specific parameters, further highlighting its advantage. The code is publicly available. |
ISSN: | 0018-9456, 1557-9662 |
DOI: | 10.1109/TIM.2023.3315406 |