Safety-Aware Adversarial Inverse Reinforcement Learning for Highway Autonomous Driving

Inverse reinforcement learning (IRL) has been successfully applied in many robotics and autonomous driving studies without the need for hand-tuning a reward function. However, it suffers from safety issues. Compared to reinforcement learning algorithms, IRL is even more vulnerable to unsafe situations because it can only infer the importance of safety from expert demonstrations. In this paper, we propose a safety-aware adversarial inverse reinforcement learning (S-AIRL) algorithm. First, the control barrier function is used to guide the training of a safety critic, which leverages knowledge of the system dynamics in the sampling process without training an additional guiding policy. The trained safety critic is then integrated into the discriminator to help distinguish generated data from expert demonstrations from the standpoint of safety. Finally, to further enforce the importance of safety, a regulator is introduced into the discriminator's training loss to prevent the recovered reward function from assigning high rewards to risky behaviors. We tested S-AIRL in a highway autonomous driving scenario. Compared to the original AIRL algorithm, at the same level of imitation learning performance, the proposed S-AIRL reduces the collision rate by 32.6%.
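
The abstract describes three mechanisms: a control-barrier-function-guided safety critic, its integration into the AIRL discriminator, and a safety regulator added to the discriminator's training loss. As a rough illustration of the last mechanism, the following is a minimal PyTorch-style sketch of a discriminator update with such a regulator. The module names, the safety critic's risk-score interface, and the exact penalty form are assumptions made for illustration, not the authors' published implementation.

    # Illustrative sketch only: the loss form, names, and safety-critic
    # interface are assumptions, not the S-AIRL authors' code.
    import torch
    import torch.nn as nn

    class AIRLDiscriminator(nn.Module):
        """AIRL-style discriminator built around a reward surrogate f(s, a)."""
        def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
            super().__init__()
            self.f = nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, 1),
            )

        def reward(self, obs, act):
            return self.f(torch.cat([obs, act], dim=-1)).squeeze(-1)

        def forward(self, obs, act, log_pi):
            # AIRL parameterization: logit(D) = f(s, a) - log pi(a|s)
            return self.reward(obs, act) - log_pi

    def discriminator_loss(disc, safety_critic, expert_batch, policy_batch,
                           lam: float = 0.1):
        """Logistic discriminator loss plus a regulator that penalizes high
        recovered rewards on transitions the safety critic flags as risky."""
        bce = nn.BCEWithLogitsLoss()
        logit_e = disc(*expert_batch)    # expert demonstrations -> label 1
        logit_p = disc(*policy_batch)    # generated (policy) data -> label 0
        loss = (bce(logit_e, torch.ones_like(logit_e))
                + bce(logit_p, torch.zeros_like(logit_p)))

        # Safety regulator (assumed form): a risk score in [0, 1] from the
        # frozen safety critic scales a penalty on positive recovered rewards.
        obs_p, act_p, _ = policy_batch
        risk = safety_critic(obs_p, act_p).detach()
        loss = loss + lam * (risk * torch.relu(disc.reward(obs_p, act_p))).mean()
        return loss

Here safety_critic is assumed to be any module mapping a state-action pair to a risk score in [0, 1]; detaching its output keeps the regulator's gradient confined to the discriminator's reward head, so the penalty shapes the recovered reward rather than the critic.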

Bibliographic Details

Published in: ASME Journal of Autonomous Vehicles and Systems (Print), 2021-10, Vol. 1 (4)
Main Authors: Li, Fangjian; Wagner, John; Wang, Yue
Publisher: ASME
Format: Article
Language: English
Online Access: Full text
DOI: 10.1115/1.4053427
ISSN: 2690-702X
EISSN: 2690-7038
Source: ASME Transactions Journals (Current)