Deep Reinforcement Learning Verification: A Survey
Deep reinforcement learning (DRL) has proven capable of superhuman performance on many complex tasks. To achieve this success, DRL algorithms train a decision-making agent to select the actions that maximize some long-term performance measure. In many consequential real-world domains, however, optimal performance is not enough to justify an algorithm's use; for example, sometimes a system's robustness, stability, or safety must be rigorously ensured. Thus, methods for verifying DRL systems have emerged. These algorithms can guarantee a system's properties over an infinite set of inputs, but the task is not trivial. DRL relies on deep neural networks (DNNs). DNNs are often referred to as "black boxes" because examining their respective structures does not elucidate their decision-making processes. Moreover, the sequential nature of the problems DRL is used to solve poses significant scalability challenges. Finally, because DRL environments are often stochastic, verification methods must account for probabilistic behavior. To address these complications, a new subfield has emerged. In this survey, we establish the foundations of DRL and DRL verification, define a taxonomy for DRL verification methods, describe approaches for dealing with stochasticity, characterize considerations related to writing specifications, enumerate common testing tasks/environments, and detail opportunities for future research.
Saved in:
Published in: | ACM Computing Surveys, 2023-07-17, Vol. 55 (14s), p. 1-31, Article 330 |
---|---|
Main authors: | Landers, Matthew; Doryab, Afsaneh |
Format: | Article |
Language: | English |
Subjects: | Computing methodologies; General and reference; Reinforcement learning; Surveys and overviews |
Publisher: | ACM, New York, NY |
Source: | ACM Digital Library Complete |
DOI: | 10.1145/3596444 |
ISSN: | 0360-0300 |
EISSN: | 1557-7341 |
Online access: | Full text |
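The abstract makes two concrete technical claims worth illustrating. First, verification methods can guarantee a property of a DNN-based policy over an infinite set of inputs. Below is a minimal sketch of that idea, assuming interval bound propagation (a common, sound-but-incomplete DNN verification technique); the network weights, input box, and property are made-up illustrations, not details taken from the paper.

```python
# Illustrative sketch: certify a property of a tiny ReLU policy network
# over an *infinite* (continuous) box of input states via interval bound
# propagation (IBP). All weights and bounds are made-up assumptions.
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate the axis-aligned box [lo, hi] through x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def interval_relu(lo, hi):
    """ReLU is monotone, so it maps interval bounds to interval bounds."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# Hypothetical 2-layer policy: 2 state inputs -> 3 hidden -> 2 action scores.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(2, 3)), np.array([1.0, 0.0])

# Property: for every state in this box, action 0 outscores action 1.
lo, hi = np.array([-0.1, -0.1]), np.array([0.1, 0.1])
lo, hi = interval_relu(*interval_affine(lo, hi, W1, b1))
lo, hi = interval_affine(lo, hi, W2, b2)

# Sound but incomplete: a positive worst-case gap proves the property for
# the whole box; a non-positive gap is inconclusive, not a counterexample.
gap = lo[0] - hi[1]
print("verified" if gap > 0 else "inconclusive", gap)
```

Second, because DRL environments are often stochastic, verification must bound probabilistic behavior. The sketch below, again an illustrative assumption rather than the paper's method, uses value iteration over a small made-up Markov decision process to bound the probability that any policy reaches an unsafe state, in the spirit of probabilistic model checking.

```python
# Illustrative sketch: worst-case probability of reaching an unsafe state
# in a toy 4-state MDP. State 2 is a safe sink, state 3 the unsafe sink;
# both absorb under every action. P[action][s, s'] is made up.
import numpy as np

P = {
    "fast": np.array([[0.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 0.7, 0.3],
                      [0.0, 0.0, 1.0, 0.0],
                      [0.0, 0.0, 0.0, 1.0]]),
    "cautious": np.array([[0.5, 0.0, 0.5, 0.0],
                          [1.0, 0.0, 0.0, 0.0],
                          [0.0, 0.0, 1.0, 0.0],
                          [0.0, 0.0, 0.0, 1.0]]),
}
UNSAFE = 3

# Value iteration for the maximum (over all policies) probability of
# eventually reaching the unsafe state, from each start state.
p = np.zeros(4)
p[UNSAFE] = 1.0
for _ in range(10_000):
    p_next = np.max([P[a] @ p for a in P], axis=0)
    if np.allclose(p_next, p, atol=1e-12):
        break
    p = p_next

# p[0] certifies "Pr[reach unsafe] <= 0.3 from state 0, under every policy".
print(p)  # approximately [0.3, 0.3, 0.0, 1.0]
```

The IBP check is conservative: a positive gap proves the property over the entire box, while a non-positive gap is inconclusive. The MDP computation, by contrast, converges to the exact worst-case reachability probability for the finite model.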