How Expressive are Transformers in Spectral Domain for Graphs?

Bibliographic Details

Main Authors: Bastos, Anson; Nadgeri, Abhishek; Singh, Kuldeep; Kanezashi, Hiroki; Suzumura, Toyotaro; Mulang', Isaiah Onando
Format: Article
Language: English
Subjects: Computer Science - Learning
Source: arXiv.org
DOI: 10.48550/arxiv.2201.09332
Published: 2022-01-23
Description

Recent works proposing transformer-based models for graphs have shown the inadequacy of the vanilla Transformer for graph representation learning. To understand this inadequacy, it is worth investigating whether spectral analysis of the transformer reveals insights into its expressive power, since similar studies have already established that spectral analysis of graph neural networks (GNNs) provides extra perspectives on their expressiveness. In this work, we systematically study and establish the link between the spatial and spectral domains for the transformer. We further provide a theoretical analysis and prove that the spatial attention mechanism in the transformer cannot effectively capture the desired frequency response, thus inherently limiting its expressiveness in spectral space. We therefore propose FeTA, a framework that performs attention over the entire graph spectrum (i.e., the actual frequency components of the graph), analogous to attention in spatial space. Empirical results suggest that FeTA provides a consistent performance gain over the vanilla transformer across all tasks on standard benchmarks, and it can easily be extended to GNN-based models with low-pass characteristics (e.g., GAT).
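The abstract's central move, replacing spatial attention over nodes with attention over the graph's frequency components, can be illustrated with the graph Fourier transform: for a Laplacian L = U diag(lambda) U^T, the columns of U are the frequency components and the eigenvalues are the graph frequencies. The sketch below is a minimal, hypothetical NumPy rendering of that idea, not the paper's actual FeTA architecture; the function names, the single query vector w_query, and the softmax-over-frequency-components parameterization are all illustrative assumptions.

```python
# Hypothetical sketch: attention over the graph spectrum instead of over nodes.
# Not the FeTA architecture itself; names and parameterization are assumed.
import numpy as np

def graph_fourier_basis(adj):
    """Eigendecomposition of the combinatorial Laplacian L = D - A.

    Returns eigenvalues (graph frequencies) and eigenvectors (Fourier basis).
    """
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj
    eigvals, eigvecs = np.linalg.eigh(lap)  # L is symmetric PSD
    return eigvals, eigvecs

def spectral_attention(x, adj, w_query):
    """Attend over frequency components rather than over nodes.

    x: node features, shape (n, d); w_query: learnable vector, shape (d,).
    Each frequency component U[:, k]^T x receives a data-dependent weight.
    """
    _, u = graph_fourier_basis(adj)
    x_hat = u.T @ x                      # graph Fourier transform, (n, d)
    scores = x_hat @ w_query             # one attention score per frequency
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                 # softmax over the spectrum
    return u @ (alpha[:, None] * x_hat)  # reweight components, invert the GFT

# Toy usage on a 4-node path graph.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.random.default_rng(0).normal(size=(4, 3))
out = spectral_attention(x, adj, w_query=np.ones(3))
print(out.shape)  # (4, 3)
```

The contrast with low-pass models such as GAT is visible in alpha: a low-pass filter effectively forces the weights on high eigenvalues toward zero, whereas learning the weights freely over the whole spectrum lets the model realize band-pass or high-pass frequency responses as well.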