Towards a Knowledge guided Multimodal Foundation Model for Spatio-Temporal Remote Sensing Applications
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | In recent years, there has been an increased interest in foundation models
for geoscience due to the vast amount of Earth-observing satellite imagery.
Existing remote sensing foundation models make use of the various sources of
spectral imagery to create large models pretrained on the task of masked
reconstruction. In this paper, we present a foundation model framework, where
the pretraining task captures the causal relationship between multiple
modalities. Our framework leverages the knowledge-guided principles that the
spectral imagery captures the impact of the physical drivers on the
environmental system, and that the relationship between them is governed by the
characteristics of the system. Specifically, our method, called MultiModal
Variable Step Forecasting (MM-VSF), uses forecasting of satellite imagery as a
pretraining task and is able to capture the causal relationship between
spectral imagery and weather. In our evaluation, we show that the forecasting of
satellite imagery using weather can be used as an effective pretraining task
for foundation models. We further show the effectiveness of the embeddings
produced by MM-VSF on the downstream tasks of pixel-wise crop mapping and
missing image prediction of spectral imagery, when compared with embeddings
created by models trained in alternative pretraining settings, including the
traditional single-modality-input masked reconstruction. |
---|---|
DOI: | 10.48550/arxiv.2407.19660 |