CIG2S: A Cross-View Image Geo-Localization Model Based on G2S Transform Suitable for Center-Misaligned Scenarios
Published in: IEEE Transactions on Computational Social Systems, 2024-10, pp. 1-16
Main Authors:
Format: Article
Language: English
Abstract: In multimedia social networks, a user's geolocation can be inferred by matching their shared images against referenced satellite images, a task known as cross-view image geo-localization. Although most existing cross-view image geo-localization methods perform well in the center-aligned scenario, in practical applications the shooting location of the query ground image is usually not aligned with the center point of the satellite images, and their geo-localization accuracy then decreases drastically. Therefore, we propose a novel cross-view image geo-localization model based on the ground-to-satellite (G2S) transform, named CIG2S. First, the query ground image is transformed into the aerial view by a spherical transform, generating G2S images, which improves the similarity between ground and satellite images. Second, multiscale features are extracted from the original ground image, the G2S images, and the satellite images by Twins-PCPVT. Furthermore, a dynamic similarity weighted loss function is designed to measure the distance between the query ground image and the referenced satellite images. Experimental results on three center-misaligned datasets, VIGOR and the center-misaligned versions of CVUSA and CVACT, demonstrate that the proposed CIG2S model significantly improves geo-localization accuracy. For example, compared with another vision-transformer-based model, L2LTR-polar, CIG2S outperforms it by about 6.6% and 15.8% on the center-misaligned datasets CVUSA_CM and CVACT_CM, respectively.
ISSN: 2329-924X, 2373-7476
DOI: 10.1109/TCSS.2024.3465539
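
The abstract above outlines the CIG2S pipeline but not its implementation. As a rough illustration of the spherical ground-to-satellite (G2S) step, here is a minimal NumPy sketch that projects an equirectangular ground panorama onto a flat ground plane to synthesize an aerial-like view. This is a generic sketch under stated assumptions, not the authors' code: the function name and the `cam_height` / `ground_extent` parameters are illustrative.

```python
import numpy as np

def ground_to_aerial(pano, aerial_size=512, cam_height=2.0, ground_extent=50.0):
    """Project an equirectangular ground panorama onto the ground plane,
    producing a synthetic aerial-view (G2S) image.

    pano: H x W x 3 array; azimuth spans the width (0..2*pi), elevation
    spans the height (+pi/2 at the top row, -pi/2 at the bottom row).
    cam_height and ground_extent (metres) are illustrative assumptions.
    """
    H, W = pano.shape[:2]
    c = (aerial_size - 1) / 2.0
    jj, ii = np.meshgrid(np.arange(aerial_size), np.arange(aerial_size))
    dx = (jj - c) / c * ground_extent   # east offset of each ground point
    dy = (c - ii) / c * ground_extent   # north offset (row 0 = north)
    # Azimuth clockwise from north, and elevation angle below the horizon
    # for a camera mounted cam_height metres above the ground plane.
    theta = np.arctan2(dx, dy) % (2 * np.pi)
    dist = np.sqrt(dx ** 2 + dy ** 2) + 1e-8
    phi = -np.arctan2(cam_height, dist)
    # Inverse mapping back into panorama pixel coordinates.
    u = theta / (2 * np.pi) * (W - 1)
    v = (0.5 - phi / np.pi) * (H - 1)
    return pano[np.clip(v, 0, H - 1).astype(int),
                np.clip(u, 0, W - 1).astype(int)]
```

The sketch uses nearest-neighbor sampling for brevity; a practical implementation would typically use bilinear interpolation (e.g., OpenCV's cv2.remap or PyTorch's grid_sample) before feeding the G2S images to the feature extractor.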