Mixture of Weak & Strong Experts on Graphs
creator | Zeng, Hanqing; Lyu, Hanjia; Hu, Diyi; Xia, Yinglong; Luo, Jiebo
description | Realistic graphs contain both (1) rich self-features of nodes
and (2) informative structures of neighborhoods, jointly handled by a Graph
Neural Network (GNN) in the typical setup. We propose to decouple the two
modalities by a Mixture of weak and strong experts (Mowst), where the weak
expert is a lightweight Multi-layer Perceptron (MLP) and the strong expert is
an off-the-shelf GNN. To adapt the experts' collaboration to different target
nodes, we propose a "confidence" mechanism based on the dispersion of the
weak expert's prediction logits. The strong expert is conditionally activated
in the low-confidence region, when either the node's classification relies on
neighborhood information or the weak expert has low model quality. We reveal
interesting training dynamics by analyzing the influence of the confidence
function on the loss: our training algorithm encourages the specialization of
each expert by effectively generating a soft splitting of the graph. In
addition, our "confidence" design imposes a desirable bias toward the strong
expert, to benefit from the GNN's better generalization capability. Mowst is
easy to optimize and achieves strong expressive power, with a computation
cost comparable to that of a single GNN. Empirically, Mowst on 4 backbone GNN
architectures shows significant accuracy improvements on 6 standard node
classification benchmarks, including both homophilous and heterophilous
graphs (https://github.com/facebookresearch/mowst-gnn). A minimal sketch of
the confidence-gated mixture follows this record. |
doi | 10.48550/arxiv.2311.05185 |
format | Article |
language | eng |
recordid | cdi_arxiv_primary_2311_05185 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence; Computer Science - Learning |
title | Mixture of Weak & Strong Experts on Graphs |
url | https://arxiv.org/abs/2311.05185 |
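
The abstract above pins down two mechanisms precisely enough to sketch: a
confidence score computed from the dispersion of the weak expert's prediction
logits, and a gate that activates the strong GNN only in the low-confidence
region. Below is a minimal Python sketch under stated assumptions: the
abstract does not specify the exact dispersion function, so the
softmax-variance measure, the `threshold` gate, the loss weighting, and the
`mlp`/`gnn`/`xent` callables are all illustrative stand-ins, not the API of
the linked mowst-gnn repository.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def confidence(weak_logits):
    # "Dispersion of the weak expert's prediction logits" (abstract).
    # Variance of the softmax probabilities is one plausible, assumed
    # choice: a peaked distribution has high variance (confident), a
    # near-uniform one has variance close to 0 (unconfident).
    p = softmax(np.asarray(weak_logits, dtype=float))
    return p.var(axis=-1)

def mowst_predict(node_feats, graph, mlp, gnn, threshold=0.05):
    # The weak expert (an MLP on self-features) always runs; the strong
    # expert (an off-the-shelf GNN on the neighborhood) is conditionally
    # activated only in the low-confidence region. `mlp`, `gnn`, and
    # `threshold` are hypothetical stand-ins for trained models and a
    # tuned gate.
    weak_logits = mlp(node_feats)
    if confidence(weak_logits) >= threshold:
        return weak_logits          # confident: the MLP answers alone
    return gnn(node_feats, graph)   # defer to the strong expert

def mowst_loss(weak_logits, strong_logits, label, xent):
    # Assumed confidence-weighted loss matching the abstract's "soft
    # splitting": high confidence routes the gradient to the weak expert,
    # low confidence to the strong one. `xent` is any cross-entropy callable.
    c = confidence(weak_logits)
    return c * xent(weak_logits, label) + (1.0 - c) * xent(strong_logits, label)
```

For a batch of nodes, the scalar gate generalizes to a per-node mask, which
is consistent with the abstract's claim that overall computation stays
comparable to a single GNN: the strong expert runs only on the low-confidence
subset.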