InterroGate: Learning to Share, Specialize, and Prune Representations for Multi-task Learning
Jointly learning multiple tasks with a unified model can improve accuracy and data efficiency, but it faces the challenge of task interference, where optimizing one task objective may inadvertently compromise the performance of another. A solution to mitigate this issue is to allocate task-specific...
Saved in:
Main Authors: | Bejnordi, Babak Ehteshami; Kumar, Gaurav; Royer, Amelie; Louizos, Christos; Blankevoort, Tijmen; Ghafoorian, Mohsen |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Learning |
Online Access: | Order full text |
creator | Bejnordi, Babak Ehteshami; Kumar, Gaurav; Royer, Amelie; Louizos, Christos; Blankevoort, Tijmen; Ghafoorian, Mohsen |
description | Jointly learning multiple tasks with a unified model can improve accuracy and
data efficiency, but it faces the challenge of task interference, where
optimizing one task objective may inadvertently compromise the performance of
another. A solution to mitigate this issue is to allocate task-specific
parameters, free from interference, on top of shared features. However,
manually designing such architectures is cumbersome, as practitioners need to
balance between the overall performance across all tasks and the higher
computational cost induced by the newly added parameters. In this work, we
propose \textit{InterroGate}, a novel multi-task learning (MTL) architecture
designed to mitigate task interference while optimizing inference computational
efficiency. We employ a learnable gating mechanism to automatically balance the
shared and task-specific representations while preserving the performance of
all tasks. Crucially, the patterns of parameter sharing and specialization
learned dynamically during training become fixed at inference, resulting in a
static, optimized MTL architecture. Through extensive empirical evaluations, we
demonstrate SoTA results on three MTL benchmarks using convolutional as well as
transformer-based backbones on CelebA, NYUD-v2, and PASCAL-Context. |
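
The description above outlines the core mechanism: a learnable gate that mixes shared and task-specific representations per task, with the learned sharing pattern frozen into a static architecture at inference. The following is a minimal PyTorch sketch of that idea, written from the abstract alone; the module, its channel-wise sigmoid gates, the 0.5 binarization threshold, and all names (`GatedMTLBlock`, `gate_logits`) are illustrative assumptions rather than the paper's actual implementation.

```python
# Hedged sketch of a channel-wise gating block for multi-task learning,
# loosely following the idea described in the abstract above. All names and
# hyperparameters here are assumptions, not the paper's API.
import torch
import torch.nn as nn


class GatedMTLBlock(nn.Module):
    """Mixes a shared branch with a task-specific branch via learnable channel gates."""

    def __init__(self, channels: int, num_tasks: int):
        super().__init__()
        self.shared = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.specific = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            for _ in range(num_tasks)
        )
        # One gate logit per (task, channel); positive init biases toward the shared branch.
        self.gate_logits = nn.Parameter(torch.full((num_tasks, channels), 2.0))

    def forward(self, x: torch.Tensor, task: int, temperature: float = 1.0) -> torch.Tensor:
        # Relaxed (soft) gate during training: each channel trades off shared vs. specific.
        gate = torch.sigmoid(self.gate_logits[task] / temperature)
        if not self.training:
            # At inference the learned pattern is frozen into a fixed binary mask,
            # so unselected branches/channels could be pruned away entirely.
            gate = (gate > 0.5).float()
        gate = gate.view(1, -1, 1, 1)  # broadcast over batch and spatial dimensions
        return gate * self.shared(x) + (1.0 - gate) * self.specific[task](x)


if __name__ == "__main__":
    block = GatedMTLBlock(channels=16, num_tasks=3)
    x = torch.randn(2, 16, 32, 32)
    y_train = block(x, task=1)   # soft mixing while training
    block.eval()
    y_fixed = block(x, task=1)   # fixed binary gates at inference
    print(y_train.shape, y_fixed.shape)
```

In a full training setup, a resource penalty on the gate values would typically accompany the task losses so that task-specific branches are only selected where they measurably help, which is how the compute/accuracy trade-off mentioned in the abstract would be controlled; that regularizer is omitted from this sketch.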
doi_str_mv | 10.48550/arxiv.2402.16848 |
format | Article |
creationdate | 2024-02-26 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2402.16848 |
language | eng |
recordid | cdi_arxiv_primary_2402_16848 |
source | arXiv.org |
subjects | Computer Science - Learning |
title | InterroGate: Learning to Share, Specialize, and Prune Representations for Multi-task Learning |