Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms
Multi-Agent Reinforcement Learning (MARL) has shown promising results across several domains. Despite this promise, MARL policies often lack robustness and are therefore sensitive to small changes in their environment. This presents a serious concern for the real world deployment of MARL algorithms,...
Saved in:
Main authors: | Bukharin, Alexander; Li, Yan; Yu, Yue; Zhang, Qingru; Chen, Zhehui; Zuo, Simiao; Zhang, Chao; Zhang, Songan; Zhao, Tuo |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Learning |
Online access: | Order full text |
creator | Bukharin, Alexander; Li, Yan; Yu, Yue; Zhang, Qingru; Chen, Zhehui; Zuo, Simiao; Zhang, Chao; Zhang, Songan; Zhao, Tuo |
description | Multi-Agent Reinforcement Learning (MARL) has shown promising results across
several domains. Despite this promise, MARL policies often lack robustness and
are therefore sensitive to small changes in their environment. This presents a
serious concern for the real-world deployment of MARL algorithms, where the
testing environment may slightly differ from the training environment. In this
work we show that we can gain robustness by controlling a policy's Lipschitz
constant, and under mild conditions, establish the existence of a Lipschitz and
close-to-optimal policy. Based on these insights, we propose a new robust MARL
framework, ERNIE, that promotes the Lipschitz continuity of the policies with
respect to the state observations and actions by adversarial regularization.
The ERNIE framework provides robustness against noisy observations, changing
transition dynamics, and malicious actions of agents. However, ERNIE's
adversarial regularization may introduce some training instability. To reduce
this instability, we reformulate adversarial regularization as a Stackelberg
game. We demonstrate the effectiveness of the proposed framework with extensive
experiments in traffic light control and particle environments. In addition, we
extend ERNIE to mean-field MARL with a formulation based on distributionally
robust optimization that outperforms its non-robust counterpart and is of
independent interest. Our code is available at
https://github.com/abukharin3/ERNIE. |
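
The abstract above describes the core mechanism: an adversarial regularizer that penalizes how much the policy's output can change under small perturbations of its observations, thereby controlling the policy's effective Lipschitz constant. As a rough illustration only, the sketch below shows one common way such a regularizer can be implemented for a discrete-action policy network in PyTorch. The function name, the PGD-style inner loop, and all hyperparameters are assumptions made for this sketch, not the authors' ERNIE implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def adversarial_smoothness_penalty(policy, obs, epsilon=0.1, step_size=0.02, num_steps=3):
    """Penalize how much the policy's action distribution can change when the
    observation is perturbed within an L-infinity ball of radius epsilon.

    The perturbation is found with a few steps of projected gradient ascent,
    and the penalty is the KL divergence between the clean and perturbed
    action distributions (a proxy for the policy's local Lipschitz behavior).
    Assumes `policy(obs)` returns logits over a discrete action space.
    """
    with torch.no_grad():
        clean_logits = policy(obs)

    delta = torch.zeros_like(obs, requires_grad=True)
    for _ in range(num_steps):
        perturbed_logits = policy(obs + delta)
        # Inner maximization: find the perturbation that changes the policy most.
        loss = F.kl_div(
            F.log_softmax(perturbed_logits, dim=-1),
            F.softmax(clean_logits, dim=-1),
            reduction="batchmean",
        )
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += step_size * grad.sign()
            delta.clamp_(-epsilon, epsilon)  # project back into the L-inf ball

    # Outer penalty: train the policy to be insensitive to the worst-case delta.
    perturbed_logits = policy(obs + delta.detach())
    return F.kl_div(
        F.log_softmax(perturbed_logits, dim=-1),
        F.softmax(clean_logits, dim=-1),
        reduction="batchmean",
    )
```

In a training loop, this penalty would typically be added to the usual policy loss with a weighting coefficient. ERNIE additionally stabilizes the resulting min-max structure by treating it as a Stackelberg game, with the policy as the leader and the perturbation as the follower; that refinement is not reflected in this simple sketch.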
doi_str_mv | 10.48550/arxiv.2310.10810 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2310.10810 |
language | eng |
recordid | cdi_arxiv_primary_2310_10810 |
source | arXiv.org |
subjects | Computer Science - Learning |
title | Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T00%3A02%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Robust%20Multi-Agent%20Reinforcement%20Learning%20via%20Adversarial%20Regularization:%20Theoretical%20Foundation%20and%20Stable%20Algorithms&rft.au=Bukharin,%20Alexander&rft.date=2023-10-16&rft_id=info:doi/10.48550/arxiv.2310.10810&rft_dat=%3Carxiv_GOX%3E2310_10810%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |