CIG2S: A Cross-View Image Geo-Localization Model Based on G2S Transform Suitable for Center-Misaligned Scenarios


Full Description

Bibliographic Details
Published in: IEEE Transactions on Computational Social Systems, 2024-10, pp. 1-16
Main authors: Li, Jiangshan; Yang, Chunfang; Qi, Baojun; Zhu, Ma; Chen, Junyang; Leung, Victor C. M.
Format: Article
Language: English
Description
Abstract: In multimedia social networks, a user's geolocation can be inferred by matching their shared images against referenced satellite images, viz. cross-view image geo-localization. Although most existing cross-view image geo-localization methods perform well in the center-aligned scenario, in practical applications the shooting location of the query ground image is most likely not aligned with the center point of the satellite images, and their geo-localization accuracy then decreases drastically. Therefore, we propose a novel cross-view image geo-localization model based on the ground-to-satellite (G2S) transform, named CIG2S. First, the query ground image is transformed into the aerial view by a spherical transform, generating G2S images, which improves the similarity between ground and satellite images. Second, multiscale features are extracted from the original ground image, the G2S images, and the satellite images by Twins-PCPVT. Third, a dynamic similarity-weighted loss function is designed to measure the distance between the query ground image and the referenced satellite image. Experimental results on three center-misaligned datasets, including VIGOR and the center-misaligned versions of CVUSA and CVACT, demonstrate that the proposed CIG2S model significantly improves geo-localization accuracy. For example, compared with another vision-transformer-based model, L2LTR-polar, CIG2S improves accuracy by about 6.6% and 15.8% on the center-misaligned datasets CVUSA_CM and CVACT_CM, respectively.
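The abstract does not spell out the paper's exact spherical G2S transform, so the following is only a minimal sketch of the standard polar-style ground-to-aerial mapping used in the cross-view geo-localization literature: the panorama's azimuth becomes the angle around the aerial image center, and the panorama's bottom row (nearest the camera) maps to the center while the top row (horizon) maps to the outer edge. The function name `ground_to_aerial` and all parameter choices are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def ground_to_aerial(pano: np.ndarray, out_size: int) -> np.ndarray:
    """Project an equirectangular ground panorama (H x W x C) onto an
    out_size x out_size aerial-view grid via an inverse polar mapping.

    NOTE: an assumed, simplified stand-in for the paper's spherical
    transform, kept here only to illustrate the idea.
    """
    H, W = pano.shape[:2]
    c = (out_size - 1) / 2.0  # aerial-image center = camera location

    # Coordinates of every target (aerial) pixel.
    v, u = np.meshgrid(np.arange(out_size), np.arange(out_size), indexing="ij")
    du, dv = u - c, v - c

    # Radius from the center and azimuth measured clockwise from "north" (up).
    r = np.sqrt(du ** 2 + dv ** 2)
    theta = np.arctan2(du, -dv) % (2 * np.pi)

    # Azimuth -> panorama column; radius -> panorama row
    # (r = 0 samples the bottom row, r >= c samples the top row).
    x = theta / (2 * np.pi) * (W - 1)
    y = (1.0 - np.clip(r / c, 0.0, 1.0)) * (H - 1)

    # Nearest-neighbor sampling; bilinear interpolation would be smoother.
    return pano[np.rint(y).astype(int), np.rint(x).astype(int)]
```

In practice such a transform is applied as a fixed preprocessing step, so the downstream feature extractor only has to bridge the remaining appearance gap rather than the full viewpoint gap.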
ISSN: 2329-924X; 2373-7476
DOI: 10.1109/TCSS.2024.3465539