GPS2Vec: Pre-Trained Semantic Embeddings for Worldwide GPS Coordinates

GPS coordinates are fine-grained location indicators that are difficult to be effectively utilized by classifiers in geo-aware applications. Previous GPS encoding methods concentrate on generating hand-crafted features for small areas of interest. However, many real world applications require a mach...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	IEEE transactions on multimedia 2022, Vol.24, p.890-903
Hauptverfasser:	Yin, Yifang, Zhang, Ying, Liu, Zhenguang, Wang, Sheng, Shah, Rajiv Ratn, Zimmermann, Roger
Format:	Artikel
Sprache:	eng
Schlagworte:	Annotations Artificial neural networks Coordinates Earth surface Encoding Feature extraction geo-aware applications Global Positioning System GPS Image classification Machine learning Neural networks Real-time systems semantic embedding Semantics Visualization
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	GPS coordinates are fine-grained location indicators that are difficult to be effectively utilized by classifiers in geo-aware applications. Previous GPS encoding methods concentrate on generating hand-crafted features for small areas of interest. However, many real world applications require a machine learning model, analogous to the pre-trained ImageNet model for images, that can efficiently generate semantically-enriched features for planet-scale GPS coordinates. To address this issue, we propose a novel two-level grid-based framework, termed GPS2Vec, which is able to extract geo-aware features in real-time for locations worldwide. The Earth's surface is first discretized by the Universal Transverse Mercator (UTM) coordinate system. Each UTM zone is then considered as a local area of interest that is further divided into fine-grained cells to perform the initial GPS encoding. We train a neural network in each UTM zone to learn the semantic embeddings from the initial GPS encoding. The training labels can be automatically derived from large-scale geotagged documents such as tweets, check-ins, and images that are available from social sharing platforms. We conducted comprehensive experiments on three geo-aware applications, namely place semantic annotation, geotagged image classification, and next location prediction. Experimental results demonstrate the effectiveness of our approach, as prediction accuracy improves significantly based on a simple multi-feature early fusion strategy with deep neural networks, including both CNNs and RNNs.
ISSN:	1520-9210 1941-0077
DOI:	10.1109/TMM.2021.3060951