How Expressive are Transformers in Spectral Domain for Graphs?

Bibliographic Details

Main Authors: Bastos, Anson; Nadgeri, Abhishek; Singh, Kuldeep; Kanezashi, Hiroki; Suzumura, Toyotaro; Mulang', Isaiah Onando
Format: Article
Language: English
Subjects: Computer Science - Learning
Source: arXiv.org
DOI: 10.48550/arxiv.2201.09332
Published: 2022-01-23
Description

Recent works proposing transformer-based models for graphs have shown the inadequacy of the vanilla Transformer for graph representation learning. To understand this inadequacy, it is worth investigating whether spectral analysis of the transformer reveals insights into its expressive power, since similar studies have already established that spectral analysis of graph neural networks (GNNs) provides extra perspectives on their expressiveness. In this work, we systematically study and establish the link between the spatial and spectral domains for the transformer. We further provide a theoretical analysis and prove that the spatial attention mechanism in the transformer cannot effectively capture the desired frequency response, thus inherently limiting its expressiveness in spectral space. We therefore propose FeTA, a framework that performs attention over the entire graph spectrum (i.e., the actual frequency components of the graph), analogous to attention in spatial space. Empirical results suggest that FeTA provides a consistent performance gain over the vanilla transformer across all tasks on standard benchmarks, and it can easily be extended to GNN-based models with low-pass characteristics (e.g., GAT).
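The abstract's central move, replacing spatial attention over nodes with attention over the graph's frequency components, can be illustrated with the graph Fourier transform: for a Laplacian L = U diag(lambda) U^T, the columns of U are the frequency components and the eigenvalues are the graph frequencies. The sketch below is a minimal, hypothetical NumPy rendering of that idea, not the paper's actual FeTA architecture; the function names, the single query vector w_query, and the softmax-over-frequency-components parameterization are all illustrative assumptions.

```python
# Hypothetical sketch: attention over the graph spectrum instead of over nodes.
# Not the FeTA architecture itself; names and parameterization are assumed.
import numpy as np

def graph_fourier_basis(adj):
    """Eigendecomposition of the combinatorial Laplacian L = D - A.

    Returns eigenvalues (graph frequencies) and eigenvectors (Fourier basis).
    """
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj
    eigvals, eigvecs = np.linalg.eigh(lap)  # L is symmetric PSD
    return eigvals, eigvecs

def spectral_attention(x, adj, w_query):
    """Attend over frequency components rather than over nodes.

    x: node features, shape (n, d); w_query: learnable vector, shape (d,).
    Each frequency component U[:, k]^T x receives a data-dependent weight.
    """
    _, u = graph_fourier_basis(adj)
    x_hat = u.T @ x                      # graph Fourier transform, (n, d)
    scores = x_hat @ w_query             # one attention score per frequency
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                 # softmax over the spectrum
    return u @ (alpha[:, None] * x_hat)  # reweight components, invert the GFT

# Toy usage on a 4-node path graph.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.random.default_rng(0).normal(size=(4, 3))
out = spectral_attention(x, adj, w_query=np.ones(3))
print(out.shape)  # (4, 3)
```

The contrast with low-pass models such as GAT is visible in alpha: a low-pass filter effectively forces the weights on high eigenvalues toward zero, whereas learning the weights freely over the whole spectrum lets the model realize band-pass or high-pass frequency responses as well.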