Safety-Aware Adversarial Inverse Reinforcement Learning for Highway Autonomous Driving

Inverse reinforcement learning (IRL) has been successfully applied in many robotics and autonomous driving studies without the need for hand-tuning a reward function. However, it suffers from safety issues. Compared to reinforcement learning algorithms, IRL is even more vulnerable to unsafe situations because it can only infer the importance of safety from expert demonstrations. In this paper, we propose a safety-aware adversarial inverse reinforcement learning (S-AIRL) algorithm. First, the control barrier function is used to guide the training of a safety critic, which leverages knowledge of the system dynamics in the sampling process without training an additional guiding policy. The trained safety critic is then integrated into the discriminator to help distinguish generated data from expert demonstrations from the standpoint of safety. Finally, to further enforce the importance of safety, a regulator is introduced into the discriminator's training loss to prevent the recovered reward function from assigning high rewards to risky behaviors. We tested S-AIRL in a highway autonomous driving scenario. Compared to the original AIRL algorithm, at the same level of imitation learning performance, the proposed S-AIRL reduces the collision rate by 32.6%.
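
The abstract describes three mechanisms: a control-barrier-function-guided safety critic, its integration into the AIRL discriminator, and a safety regulator added to the discriminator's training loss. As a rough illustration of the last mechanism, the following is a minimal PyTorch-style sketch of a discriminator update with such a regulator. The module names, the safety critic's risk-score interface, and the exact penalty form are assumptions made for illustration, not the authors' published implementation.

    # Illustrative sketch only: the loss form, names, and safety-critic
    # interface are assumptions, not the S-AIRL authors' code.
    import torch
    import torch.nn as nn

    class AIRLDiscriminator(nn.Module):
        """AIRL-style discriminator built around a reward surrogate f(s, a)."""
        def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
            super().__init__()
            self.f = nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, 1),
            )

        def reward(self, obs, act):
            return self.f(torch.cat([obs, act], dim=-1)).squeeze(-1)

        def forward(self, obs, act, log_pi):
            # AIRL parameterization: logit(D) = f(s, a) - log pi(a|s)
            return self.reward(obs, act) - log_pi

    def discriminator_loss(disc, safety_critic, expert_batch, policy_batch,
                           lam: float = 0.1):
        """Logistic discriminator loss plus a regulator that penalizes high
        recovered rewards on transitions the safety critic flags as risky."""
        bce = nn.BCEWithLogitsLoss()
        logit_e = disc(*expert_batch)    # expert demonstrations -> label 1
        logit_p = disc(*policy_batch)    # generated (policy) data -> label 0
        loss = (bce(logit_e, torch.ones_like(logit_e))
                + bce(logit_p, torch.zeros_like(logit_p)))

        # Safety regulator (assumed form): a risk score in [0, 1] from the
        # frozen safety critic scales a penalty on positive recovered rewards.
        obs_p, act_p, _ = policy_batch
        risk = safety_critic(obs_p, act_p).detach()
        loss = loss + lam * (risk * torch.relu(disc.reward(obs_p, act_p))).mean()
        return loss

Here safety_critic is assumed to be any module mapping a state-action pair to a risk score in [0, 1]; detaching its output keeps the regulator's gradient confined to the discriminator's reward head, so the penalty shapes the recovered reward rather than the critic.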

Bibliographic Details

Published in: ASME Journal of Autonomous Vehicles and Systems (Print), 2021-10, Vol. 1 (4)
Main Authors: Li, Fangjian; Wagner, John; Wang, Yue
Publisher: ASME
Format: Article
Language: English
Online Access: Full text
DOI: 10.1115/1.4053427
ISSN: 2690-702X
EISSN: 2690-7038
Source: ASME Transactions Journals (Current)