Deep Reinforcement Learning Verification: A Survey

Deep reinforcement learning (DRL) has proven capable of superhuman performance on many complex tasks. To achieve this success, DRL algorithms train a decision-making agent to select the actions that maximize some long-term performance measure. In many consequential real-world domains, however, optimal performance is not enough to justify an algorithm's use—for example, sometimes a system's robustness, stability, or safety must be rigorously ensured. Thus, methods for verifying DRL systems have emerged. These algorithms can guarantee a system's properties over an infinite set of inputs, but the task is not trivial. DRL relies on deep neural networks (DNNs). DNNs are often referred to as "black boxes" because examining their respective structures does not elucidate their decision-making processes. Moreover, the sequential nature of the problems DRL is used to solve promotes significant scalability challenges. Finally, because DRL environments are often stochastic, verification methods must account for probabilistic behavior. To address these complications, a new subfield has emerged. In this survey, we establish the foundations of DRL and DRL verification, define a taxonomy for DRL verification methods, describe approaches for dealing with stochasticity, characterize considerations related to writing specifications, enumerate common testing tasks/environments, and detail opportunities for future research.

Detailed Description

Saved in:
Bibliographic Details
Published in: ACM computing surveys 2023-07, Vol.55 (14s), p.1-31, Article 330
Main Authors: Landers, Matthew; Doryab, Afsaneh
Format: Article
Language: English
Online Access: Full text
DOI: 10.1145/3596444
ISSN: 0360-0300
EISSN: 1557-7341
Source: ACM Digital Library Complete
Subjects: Computing methodologies; General and reference; Reinforcement learning; Surveys and overviews
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T06%3A18%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Reinforcement%20Learning%20Verification:%20A%20Survey&rft.jtitle=ACM%20computing%20surveys&rft.au=Landers,%20Matthew&rft.date=2023-07-17&rft.volume=55&rft.issue=14s&rft.spage=1&rft.epage=31&rft.pages=1-31&rft.artnum=330&rft.issn=0360-0300&rft.eissn=1557-7341&rft_id=info:doi/10.1145/3596444&rft_dat=%3Cacm_cross%3E3596444%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true