InterroGate: Learning to Share, Specialize, and Prune Representations for Multi-task Learning
Jointly learning multiple tasks with a unified model can improve accuracy and data efficiency, but it faces the challenge of task interference, where optimizing one task objective may inadvertently compromise the performance of another. A solution to mitigate this issue is to allocate task-specific...
Saved in:
Main Authors: | Bejnordi, Babak Ehteshami; Kumar, Gaurav; Royer, Amelie; Louizos, Christos; Blankevoort, Tijmen; Ghafoorian, Mohsen |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Learning |
Online Access: | Order full text |
creator | Bejnordi, Babak Ehteshami; Kumar, Gaurav; Royer, Amelie; Louizos, Christos; Blankevoort, Tijmen; Ghafoorian, Mohsen |
description | Jointly learning multiple tasks with a unified model can improve accuracy and
data efficiency, but it faces the challenge of task interference, where
optimizing one task objective may inadvertently compromise the performance of
another. A solution to mitigate this issue is to allocate task-specific
parameters, free from interference, on top of shared features. However,
manually designing such architectures is cumbersome, as practitioners need to
balance between the overall performance across all tasks and the higher
computational cost induced by the newly added parameters. In this work, we
propose \textit{InterroGate}, a novel multi-task learning (MTL) architecture
designed to mitigate task interference while optimizing inference computational
efficiency. We employ a learnable gating mechanism to automatically balance the
shared and task-specific representations while preserving the performance of
all tasks. Crucially, the patterns of parameter sharing and specialization
learned dynamically during training become fixed at inference, resulting in a
static, optimized MTL architecture. Through extensive empirical evaluations, we
demonstrate SoTA results on three MTL benchmarks using convolutional as well as
transformer-based backbones on CelebA, NYUD-v2, and PASCAL-Context. |
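
The description above outlines the core mechanism: a learnable gate that mixes shared and task-specific representations per task, with the learned sharing pattern frozen into a static architecture at inference. The following is a minimal PyTorch sketch of that idea, written from the abstract alone; the module, its channel-wise sigmoid gates, the 0.5 binarization threshold, and all names (`GatedMTLBlock`, `gate_logits`) are illustrative assumptions rather than the paper's actual implementation.

```python
# Hedged sketch of a channel-wise gating block for multi-task learning,
# loosely following the idea described in the abstract above. All names and
# hyperparameters here are assumptions, not the paper's API.
import torch
import torch.nn as nn


class GatedMTLBlock(nn.Module):
    """Mixes a shared branch with a task-specific branch via learnable channel gates."""

    def __init__(self, channels: int, num_tasks: int):
        super().__init__()
        self.shared = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.specific = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            for _ in range(num_tasks)
        )
        # One gate logit per (task, channel); positive init biases toward the shared branch.
        self.gate_logits = nn.Parameter(torch.full((num_tasks, channels), 2.0))

    def forward(self, x: torch.Tensor, task: int, temperature: float = 1.0) -> torch.Tensor:
        # Relaxed (soft) gate during training: each channel trades off shared vs. specific.
        gate = torch.sigmoid(self.gate_logits[task] / temperature)
        if not self.training:
            # At inference the learned pattern is frozen into a fixed binary mask,
            # so unselected branches/channels could be pruned away entirely.
            gate = (gate > 0.5).float()
        gate = gate.view(1, -1, 1, 1)  # broadcast over batch and spatial dimensions
        return gate * self.shared(x) + (1.0 - gate) * self.specific[task](x)


if __name__ == "__main__":
    block = GatedMTLBlock(channels=16, num_tasks=3)
    x = torch.randn(2, 16, 32, 32)
    y_train = block(x, task=1)   # soft mixing while training
    block.eval()
    y_fixed = block(x, task=1)   # fixed binary gates at inference
    print(y_train.shape, y_fixed.shape)
```

In a full training setup, a resource penalty on the gate values would typically accompany the task losses so that task-specific branches are only selected where they measurably help, which is how the compute/accuracy trade-off mentioned in the abstract would be controlled; that regularizer is omitted from this sketch.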
doi_str_mv | 10.48550/arxiv.2402.16848 |
format | Article |
creationdate | 2024-02-26 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2402.16848 |
language | eng |
recordid | cdi_arxiv_primary_2402_16848 |
source | arXiv.org |
subjects | Computer Science - Learning |
title | InterroGate: Learning to Share, Specialize, and Prune Representations for Multi-task Learning |