From Pixels to Predictions: Spectrogram and Vision Transformer for Better Time Series Forecasting
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Time series forecasting plays a crucial role in decision-making across
various domains, but it presents significant challenges. Recent studies have
explored image-driven approaches using computer vision models to address these
challenges, often employing line plots as the visual representation of time
series data. In this paper, we propose a novel approach that uses
time-frequency spectrograms as the visual representation of time series data.
We introduce the use of a vision transformer for multimodal learning,
showcasing the advantages of our approach across diverse datasets from
different domains. To evaluate its effectiveness, we compare our method against
statistical baselines (EMA and ARIMA), a state-of-the-art deep learning-based
approach (DeepAR), and another visual representation of time series data (line plot
images), and we run an ablation study that uses only the time series as input. Our
experiments demonstrate the benefits of utilizing spectrograms as a visual
representation for time series data, along with the advantages of employing a
vision transformer for simultaneous learning in both the time and frequency
domains. |
---|---|
DOI: | 10.48550/arxiv.2403.11047 |
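
The summary above does not say how the time-frequency spectrograms are computed. As a rough illustration of the general idea only, the sketch below turns a one-dimensional series into a time-frequency magnitude grid using a short-time Fourier transform with a Hann window. The transform choice, the function name `stft_magnitude`, and the parameters `window_size` and `hop` are assumptions made for this example and are not taken from the paper.

```python
import cmath
import math


def stft_magnitude(series, window_size=16, hop=8):
    """Return a magnitude grid: rows are frequency bins, columns are time frames.

    Hypothetical helper for illustration; not the paper's method.
    """
    if len(series) < window_size:
        raise ValueError("series is shorter than one window")
    # A Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (window_size - 1))
              for n in range(window_size)]
    n_bins = window_size // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(series) - window_size + 1, hop):
        frame = [series[start + n] * window[n] for n in range(window_size)]
        spectrum = []
        for k in range(n_bins):
            # Naive DFT of one frame: O(window_size^2), fine for a sketch.
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                        for n in range(window_size))
            spectrum.append(abs(coeff))
        frames.append(spectrum)
    # Transpose so that rows are frequency bins and columns are time frames.
    return [list(row) for row in zip(*frames)]


# Toy input (assumed): a sine wave completing 4 cycles per 16-sample window.
signal = [math.sin(2 * math.pi * 4 * t / 16) for t in range(64)]
spec = stft_magnitude(signal)
# Index of the strongest frequency bin in the first time frame.
peak_bin = max(range(len(spec)), key=lambda k: spec[k][0])
```

The resulting grid can be rendered as an image (for example, one pixel per cell) before being passed to an image model. A wavelet-based transform would be an equally plausible choice for the time-frequency step.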