Value Functions are Control Barrier Functions: Verification of Safe Policies using Control Theory

Guaranteeing safe behaviour of reinforcement learning (RL) policies poses significant challenges for safety-critical applications, despite RL's generality and scalability. To address this, we propose a new approach to apply verification methods from control theory to learned value functions. By analyzing task structures for safety preservation, we formalize original theorems that establish links between value functions and control barrier functions. Further, we propose novel metrics for verifying value functions in safe control tasks and practical implementation details to improve learning. Our work presents a novel method for certificate learning, which unlocks a diversity of verification techniques from control theory for RL policies, and marks a significant step towards a formal framework for the general, scalable, and verifiable design of RL-based control systems. Code and videos are available at https://rl-cbf.github.io/
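The abstract's central claim, that a learned value function can play the role of a control barrier function (CBF), parallels two textbook definitions. The following is a minimal illustrative sketch of those standard definitions only, not the paper's actual theorem statements; the symbols $f$, $h$, $V^{*}$, $\alpha$, $\gamma$ are generic placeholders assumed here for exposition.

```latex
% Minimal sketch: the standard discrete-time CBF condition and the
% Bellman optimality equation it structurally resembles. Illustrative
% background only; not the theorems formalized in the paper.
\documentclass{article}
\usepackage{amsmath}
\begin{document}

A function $h$ with safe set $\mathcal{C} = \{x : h(x) \ge 0\}$ is a
discrete-time control barrier function if, for all $x \in \mathcal{C}$,
\[
  \sup_{u}\, h\bigl(f(x,u)\bigr) \;\ge\; \alpha\, h(x),
  \qquad \alpha \in [0,1),
\]
i.e.\ some action keeps $h$ nonnegative, rendering $\mathcal{C}$ forward
invariant. Compare the Bellman optimality equation for a safety-indicator
reward $r(x) = \mathbf{1}[x \in \mathcal{C}]$ with discount $\gamma \in (0,1)$:
\[
  V^{*}(x) = r(x) + \gamma \max_{u} V^{*}\bigl(f(x,u)\bigr)
  \quad\Longleftrightarrow\quad
  \max_{u} V^{*}\bigl(f(x,u)\bigr) = \tfrac{1}{\gamma}\bigl(V^{*}(x) - r(x)\bigr).
\]
Both conditions bound the best achievable successor-state score from
below, which is why suitable superlevel sets of a learned value function
can serve the same certifying role as the zero superlevel set of a CBF.

\end{document}
```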

Bibliographic Details

Published in: arXiv.org, 2023-12
Main Authors: Tan, Daniel C H; Acero, Fernando; McCarthy, Robert; Kanoulas, Dimitrios; Li, Zhibin
Format: Article
Language: English
Publisher: Cornell University Library, arXiv.org (Ithaca)
EISSN: 2331-8422
Subjects: Control systems design; Control tasks; Control theory; Learning; Policies; Safety critical; Verification
Rights: Published under the Creative Commons Attribution 4.0 license (http://creativecommons.org/licenses/by/4.0/)
Online Access: Full text