Deep Reinforcement Learning Verification: A Survey

Deep reinforcement learning (DRL) has proven capable of superhuman performance on many complex tasks. To achieve this success, DRL algorithms train a decision-making agent to select the actions that maximize some long-term performance measure. In many consequential real-world domains, however, optimal performance is not enough to justify an algorithm's use—for example, sometimes a system's robustness, stability, or safety must be rigorously ensured. Thus, methods for verifying DRL systems have emerged. These algorithms can guarantee a system's properties over an infinite set of inputs, but the task is not trivial. DRL relies on deep neural networks (DNNs). DNNs are often referred to as "black boxes" because examining their respective structures does not elucidate their decision-making processes. Moreover, the sequential nature of the problems DRL is used to solve promotes significant scalability challenges. Finally, because DRL environments are often stochastic, verification methods must account for probabilistic behavior. To address these complications, a new subfield has emerged. In this survey, we establish the foundations of DRL and DRL verification, define a taxonomy for DRL verification methods, describe approaches for dealing with stochasticity, characterize considerations related to writing specifications, enumerate common testing tasks/environments, and detail opportunities for future research.

Detailed Description

Saved in:
Bibliographic Details
Published in: ACM computing surveys 2023-07, Vol.55 (14s), p.1-31, Article 330
Main Authors: Landers, Matthew; Doryab, Afsaneh
Format: Article
Language: English
Online Access: Full text
DOI: 10.1145/3596444
ISSN: 0360-0300
EISSN: 1557-7341
Source: ACM Digital Library Complete
Subjects: Computing methodologies; General and reference; Reinforcement learning; Surveys and overviews
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T06%3A18%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Reinforcement%20Learning%20Verification:%20A%20Survey&rft.jtitle=ACM%20computing%20surveys&rft.au=Landers,%20Matthew&rft.date=2023-07-17&rft.volume=55&rft.issue=14s&rft.spage=1&rft.epage=31&rft.pages=1-31&rft.artnum=330&rft.issn=0360-0300&rft.eissn=1557-7341&rft_id=info:doi/10.1145/3596444&rft_dat=%3Cacm_cross%3E3596444%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true