Correlated Attention in Transformers for Multivariate Time Series
Format: Article
Language: English
Abstract: Multivariate time series (MTS) analysis prevails in real-world applications
such as finance, climate science and healthcare. The various self-attention
mechanisms, the backbone of the state-of-the-art Transformer-based models,
efficiently discover the temporal dependencies, yet cannot well capture the
intricate cross-correlation between different features of MTS data, which
inherently stems from complex dynamical systems in practice. To this end, we
propose a novel correlated attention mechanism, which not only efficiently
captures feature-wise dependencies, but can also be seamlessly integrated
within the encoder blocks of existing well-known Transformers to gain
efficiency improvement. In particular, correlated attention operates across
feature channels to compute cross-covariance matrices between queries and keys
with different lag values, and selectively aggregate representations at the
sub-series level. This architecture facilitates automated discovery and
representation learning of not only instantaneous but also lagged
cross-correlations, while inherently capturing time series auto-correlation.
When combined with prevalent Transformer baselines, the correlated attention
mechanism constitutes a better alternative for encoder-only architectures,
which are suitable for a wide range of tasks including imputation, anomaly
detection and classification. Extensive experiments on the aforementioned tasks
consistently underscore the advantages of the correlated attention mechanism in
enhancing base Transformer models, and demonstrate our state-of-the-art results
in imputation, anomaly detection and classification.
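To make the mechanism described in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of feature-wise cross-covariance attention over lagged keys and values. The function name `correlated_attention`, the candidate lag set, and the per-lag scoring rule (mean absolute covariance with softmax weighting) are illustrative assumptions standing in for the paper's sub-series-level aggregation.

```python
import torch

def correlated_attention(queries, keys, values, lags=(0, 1, 2)):
    """Sketch: cross-covariance attention across feature channels with lags.

    queries, keys, values: tensors of shape (batch, length, d_model).
    For each candidate lag, keys/values are shifted along the time axis,
    a feature-by-feature cross-covariance matrix between queries and the
    shifted keys is computed, and the resulting mixing weights are applied
    to the shifted values. Lag-specific outputs are then combined with
    softmax-normalized relevance scores (an assumed aggregation rule).
    """
    B, L, D = queries.shape
    outputs, scores = [], []
    for lag in lags:
        k = torch.roll(keys, shifts=-lag, dims=1)
        v = torch.roll(values, shifts=-lag, dims=1)
        # Cross-covariance between query and (lagged) key feature channels: (B, D, D)
        cov = torch.einsum("bld,ble->bde", queries, k) / L
        attn = torch.softmax(cov / D**0.5, dim=-1)
        # Mix value feature channels with the covariance-based weights: (B, L, D)
        outputs.append(torch.einsum("bde,ble->bld", attn, v))
        # Scalar relevance score per lag (assumption: mean absolute covariance)
        scores.append(cov.abs().mean(dim=(1, 2)))
    weights = torch.softmax(torch.stack(scores, dim=-1), dim=-1)  # (B, n_lags)
    out = torch.stack(outputs, dim=-1)                            # (B, L, D, n_lags)
    return torch.einsum("bldn,bn->bld", out, weights)
```

In this sketch the attention matrix is D x D (feature-wise) rather than L x L (time-wise), which is what lets it capture cross-correlation between variables; the loop over lags is what exposes lagged rather than only instantaneous dependencies.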
DOI: 10.48550/arxiv.2311.11959