Effect-Invariant Mechanisms for Policy Generalization
Policy learning is an important component of many real-world learning systems. A major challenge in policy learning is how to adapt efficiently to unseen environments or tasks. Recently, it has been suggested to exploit invariant conditional distributions to learn models that generalize better to unseen environments. However, assuming invariance of entire conditional distributions (which we call full invariance) may be too strong an assumption in practice. In this paper, we introduce a relaxation of full invariance called effect-invariance (e-invariance for short) and prove that it is sufficient, under suitable assumptions, for zero-shot policy generalization. We also discuss an extension that exploits e-invariance when we have a small sample from the test environment, enabling few-shot policy generalization. Our work does not assume an underlying causal graph or that the data are generated by a structural causal model; instead, we develop testing procedures to test e-invariance directly from data. We present empirical results using simulated data and a mobile health intervention dataset to demonstrate the effectiveness of our approach.
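The abstract's central distinction can be written out explicitly. The display below is our own notation, reconstructed from the abstract's description rather than taken from the paper: e and f index environments, A an action, Y an outcome, and X_S a set of covariates; the choice of a baseline action a = 0 is an illustrative assumption.

```latex
% Full invariance: the entire conditional distribution of the outcome
% is the same in every environment.
\forall e, f:\quad P^{e}(Y \mid A, X_S) = P^{f}(Y \mid A, X_S)

% E-invariance (a relaxation): only the *effect* of the action, i.e. the
% contrast of conditional means against a baseline action a = 0,
% must be the same in every environment.
\forall e, f,\ \forall a:\quad
  \mathbb{E}^{e}[Y \mid A = a, X_S] - \mathbb{E}^{e}[Y \mid A = 0, X_S]
  = \mathbb{E}^{f}[Y \mid A = a, X_S] - \mathbb{E}^{f}[Y \mid A = 0, X_S]
```

Since the paper tests e-invariance directly from data, a toy numerical check helps make the object concrete. The sketch below is a generic illustration, not the paper's testing procedure; every name in it (sample_env, effect_curve, the linear outcome model with an A*X interaction) is our own construction.

```python
# Illustrative check of e-invariance between two environments.
# NOT the paper's test: we fit a separate outcome regression per
# environment, form the estimated effect E[Y|A=1,X] - E[Y|A=0,X]
# on a shared grid of covariate values, and compare the two curves.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

def sample_env(n, shift):
    """Toy environment: the X -> Y mechanism shifts, the A -> Y effect does not."""
    x = rng.normal(shift, 1.0, size=n)                 # covariate shift per environment
    a = rng.integers(0, 2, size=n)                     # binary action
    y = 2.0 * a * x + shift * x + rng.normal(size=n)   # effect term 2*a*x is shared
    return x, a, y

def effect_curve(x, a, y, grid):
    """Estimate E[Y|A=1,X=x] - E[Y|A=0,X=x] via a linear model with an A*X term."""
    feats = np.column_stack([x, a, a * x])
    model = LinearRegression().fit(feats, y)
    f1 = model.predict(np.column_stack([grid, np.ones_like(grid), grid]))
    f0 = model.predict(np.column_stack([grid, np.zeros_like(grid), grid]))
    return f1 - f0

grid = np.linspace(-2, 2, 50)
curve_e = effect_curve(*sample_env(5000, shift=0.0), grid)
curve_f = effect_curve(*sample_env(5000, shift=1.0), grid)

# Under e-invariance the two curves agree up to sampling noise, even
# though the full conditionals P(Y | A, X) differ across environments.
print("max effect discrepancy:", np.max(np.abs(curve_e - curve_f)))
```

In the toy data the X-to-Y mechanism shifts across environments, so full invariance fails, yet the action's effect term is shared, so the two estimated effect curves agree; that gap between the two conditions is exactly what e-invariance exploits.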
Published in: | Journal of machine learning research, 2024, Vol. 25 |
---|---|
Main authors: | Saengkyongam, Sorawit; Pfister, Niklas; Klasnja, Predrag; Murphy, Susan; Peters, Jonas |
Format: | Article |
Language: | English |
Online access: | Full text |
ISSN: | 1532-4435 |
EISSN: | 1533-7928 |
PMID: | 39082006 |