Efficient solutions to factored MDPs with imprecise transition probabilities

When modeling real-world decision-theoretic planning problems in the Markov Decision Process (MDP) framework, it is often impossible to obtain a completely accurate estimate of transition probabilities. For example, natural uncertainty arises in the transition specification due to elicitation of MDP...
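
As a rough illustration of the max-min idea behind planning with imprecise transition probabilities, the sketch below runs robust value iteration on a small flat (non-factored) model in which each transition probability is only known to lie in an interval. It is not the article's factored algorithm: the interval representation, the greedy inner minimization, and all names (`worst_case_distribution`, `robust_value_iteration`, the toy model) are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound) on one probability


def worst_case_distribution(bounds: List[Interval], values: List[float]) -> List[float]:
    """Return the distribution inside the interval bounds with the lowest expected value.

    Assumes the bounds are consistent: sum of lower bounds <= 1 <= sum of upper bounds.
    """
    probs = [lo for lo, _ in bounds]
    remaining = 1.0 - sum(probs)
    # Hand the leftover probability mass to the lowest-valued successors first.
    for s in sorted(range(len(values)), key=lambda i: values[i]):
        extra = min(bounds[s][1] - probs[s], remaining)
        probs[s] += extra
        remaining -= extra
    return probs


def robust_value_iteration(
    rewards: List[float],
    bounds: List[Dict[str, List[Interval]]],
    gamma: float = 0.9,
    tol: float = 1e-8,
    max_iters: int = 10_000,
) -> List[float]:
    """Max over actions of the min over admissible transition distributions.

    bounds[s][a] lists one interval per successor state for action a in state s.
    """
    values = [0.0] * len(rewards)
    for _ in range(max_iters):
        new_values = []
        for s, reward in enumerate(rewards):
            best = max(
                sum(p * v for p, v in zip(worst_case_distribution(b, values), values))
                for b in bounds[s].values()
            )
            new_values.append(reward + gamma * best)
        if max(abs(a - b) for a, b in zip(new_values, values)) < tol:
            return new_values
        values = new_values
    return values


if __name__ == "__main__":
    # Hypothetical two-state model: state 1 is rewarding and absorbing,
    # state 0 has one action whose outcome probabilities are only known up to intervals.
    toy_rewards = [0.0, 1.0]
    toy_bounds = [
        {"go": [(0.2, 0.8), (0.2, 0.8)]},
        {"stay": [(0.0, 0.0), (1.0, 1.0)]},
    ]
    print(robust_value_iteration(toy_rewards, toy_bounds))
```

With plain interval bounds the inner minimization has this simple greedy solution; for more general constraint sets on the probabilities it calls for an optimization routine, which is the cost the abstract identifies as the main bottleneck.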

Bibliographic details
Published in:	Artificial intelligence 2011-06, Vol.175 (9), p.1498-1527
Main authors:	Delgado, Karina Valdivia; Sanner, Scott; de Barros, Leliane Nunes
Format:	Article
Language:	eng
Subjects:	Algorithms; Applied sciences; Approximation; Artificial intelligence; Computer science; control theory; systems; Decision theory. Utility theory; Exact sciences and technology; Learning and adaptive systems; Markov Decision Process; Markov processes; Mathematical analysis; Mathematical models; Mathematical programming; Mathematics; Nonlinearity; Operational research and scientific management; Operational research. Management science; Optimization; Probabilistic planning; Probability and statistics; Probability theory and stochastic processes; Robust planning; Sciences and techniques of general use; Transition probabilities; Uncertainty
Online access:	Full text

container_end_page	1527
container_issue	9
container_start_page	1498
container_title	Artificial intelligence
container_volume	175
creator	Delgado, Karina Valdivia; Sanner, Scott; de Barros, Leliane Nunes
description	When modeling real-world decision-theoretic planning problems in the Markov Decision Process (MDP) framework, it is often impossible to obtain a completely accurate estimate of transition probabilities. For example, natural uncertainty arises in the transition specification due to elicitation of MDP transition models from an expert or estimation from data, or non-stationary transition distributions arising from insufficient state knowledge. In the interest of obtaining the most robust policy under transition uncertainty, the Markov Decision Process with Imprecise Transition Probabilities (MDP-IPs) has been introduced to model such scenarios. Unfortunately, while various solution algorithms exist for MDP-IPs, they often require external calls to optimization routines and thus can be extremely time-consuming in practice. To address this deficiency, we introduce the factored MDP-IP and propose efficient dynamic programming methods to exploit its structure. Noting that the key computational bottleneck in the solution of factored MDP-IPs is the need to repeatedly solve nonlinear constrained optimization problems, we show how to target approximation techniques to drastically reduce the computational overhead of the nonlinear solver while producing bounded, approximately optimal solutions. Our results show up to two orders of magnitude speedup in comparison to traditional “flat” dynamic programming approaches and up to an order of magnitude speedup over the extension of factored MDP approximate value iteration techniques to MDP-IPs while producing the lowest error of any approximation algorithm evaluated.
doi_str_mv	10.1016/j.artint.2011.01.001
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_901650451</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0004370211000026</els_id><sourcerecordid>901650451</sourcerecordid><originalsourceid>FETCH-LOGICAL-c414t-47bdb5d403ff3e62cd9aa4acbe114c81e0cf509b9db4aab655579273dc0ce9263</originalsourceid><addsrcrecordid>eNp9kE9LxDAQxYMouK5-Aw-5iKfWSZu2m4sguv6BFT3oOaTpBLN0mzXJKn57U7p4FAaGgTfz5v0IOWeQM2D11TpXPtoh5gUwlkMqYAdkxhZNkTWiYIdkBgA8KxsojslJCOs0lkKwGVktjbHa4hBpcP0uWjcEGh01SkfnsaPPd6-Bftv4Qe1m61HbgDR6NQQ7aunWu1a1tk8ThlNyZFQf8Gzf5-T9fvl2-5itXh6ebm9WmeaMx4w3bddWHYfSmBLrQndCKa50i4xxvWAI2lQgWtG1XKm2rqoqpWjKToNGUdTlnFxOd5P75w5DlBsbNPa9GtDtghQJSgW8YknJJ6X2LgSPRm693Sj_IxnIkZ1cy4mdHNlJSAXj2sXeQAWtepPypuB_uwUvmvRgk3TXkw5T2i-LXoaRpcbOJlRRds7-b_QLxSiIRA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>901650451</pqid></control><display><type>article</type><title>Efficient solutions to factored MDPs with imprecise transition probabilities</title><source>Elsevier ScienceDirect Journals Complete</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>Delgado, Karina Valdivia ; Sanner, Scott ; de Barros, Leliane Nunes</creator><creatorcontrib>Delgado, Karina Valdivia ; Sanner, Scott ; de Barros, Leliane Nunes</creatorcontrib><description>When modeling real-world decision-theoretic planning problems in the Markov Decision Process (MDP) framework, it is often impossible to obtain a completely accurate estimate of transition probabilities. For example, natural uncertainty arises in the transition specification due to elicitation of MDP transition models from an expert or estimation from data, or non-stationary transition distributions arising from insufficient state knowledge. 
In the interest of obtaining the most robust policy under transition uncertainty, the Markov Decision Process with Imprecise Transition Probabilities (MDP-IPs) has been introduced to model such scenarios. Unfortunately, while various solution algorithms exist for MDP-IPs, they often require external calls to optimization routines and thus can be extremely time-consuming in practice. To address this deficiency, we introduce the factored MDP-IP and propose efficient dynamic programming methods to exploit its structure. Noting that the key computational bottleneck in the solution of factored MDP-IPs is the need to repeatedly solve nonlinear constrained optimization problems, we show how to target approximation techniques to drastically reduce the computational overhead of the nonlinear solver while producing bounded, approximately optimal solutions. Our results show up to two orders of magnitude speedup in comparison to traditional “flat” dynamic programming approaches and up to an order of magnitude speedup over the extension of factored MDP approximate value iteration techniques to MDP-IPs while producing the lowest error of any approximation algorithm evaluated.</description><identifier>ISSN: 0004-3702</identifier><identifier>EISSN: 1872-7921</identifier><identifier>DOI: 10.1016/j.artint.2011.01.001</identifier><identifier>CODEN: AINTBB</identifier><language>eng</language><publisher>Oxford: Elsevier B.V</publisher><subject>Algorithms ; Applied sciences ; Approximation ; Artificial intelligence ; Computer science; control theory; systems ; Decision theory. Utility theory ; Exact sciences and technology ; Learning and adaptive systems ; Markov Decision Process ; Markov processes ; Mathematical analysis ; Mathematical models ; Mathematical programming ; Mathematics ; Nonlinearity ; Operational research and scientific management ; Operational research. 
Management science ; Optimization ; Probabilistic planning ; Probability and statistics ; Probability theory and stochastic processes ; Robust planning ; Sciences and techniques of general use ; Transition probabilities ; Uncertainty</subject><ispartof>Artificial intelligence, 2011-06, Vol.175 (9), p.1498-1527</ispartof><rights>2011 Elsevier B.V.</rights><rights>2015 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c414t-47bdb5d403ff3e62cd9aa4acbe114c81e0cf509b9db4aab655579273dc0ce9263</citedby><cites>FETCH-LOGICAL-c414t-47bdb5d403ff3e62cd9aa4acbe114c81e0cf509b9db4aab655579273dc0ce9263</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.artint.2011.01.001$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=24275097$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Delgado, Karina Valdivia</creatorcontrib><creatorcontrib>Sanner, Scott</creatorcontrib><creatorcontrib>de Barros, Leliane Nunes</creatorcontrib><title>Efficient solutions to factored MDPs with imprecise transition probabilities</title><title>Artificial intelligence</title><description>When modeling real-world decision-theoretic planning problems in the Markov Decision Process (MDP) framework, it is often impossible to obtain a completely accurate estimate of transition probabilities. For example, natural uncertainty arises in the transition specification due to elicitation of MDP transition models from an expert or estimation from data, or non-stationary transition distributions arising from insufficient state knowledge. 
In the interest of obtaining the most robust policy under transition uncertainty, the Markov Decision Process with Imprecise Transition Probabilities (MDP-IPs) has been introduced to model such scenarios. Unfortunately, while various solution algorithms exist for MDP-IPs, they often require external calls to optimization routines and thus can be extremely time-consuming in practice. To address this deficiency, we introduce the factored MDP-IP and propose efficient dynamic programming methods to exploit its structure. Noting that the key computational bottleneck in the solution of factored MDP-IPs is the need to repeatedly solve nonlinear constrained optimization problems, we show how to target approximation techniques to drastically reduce the computational overhead of the nonlinear solver while producing bounded, approximately optimal solutions. Our results show up to two orders of magnitude speedup in comparison to traditional “flat” dynamic programming approaches and up to an order of magnitude speedup over the extension of factored MDP approximate value iteration techniques to MDP-IPs while producing the lowest error of any approximation algorithm evaluated.</description><subject>Algorithms</subject><subject>Applied sciences</subject><subject>Approximation</subject><subject>Artificial intelligence</subject><subject>Computer science; control theory; systems</subject><subject>Decision theory. Utility theory</subject><subject>Exact sciences and technology</subject><subject>Learning and adaptive systems</subject><subject>Markov Decision Process</subject><subject>Markov processes</subject><subject>Mathematical analysis</subject><subject>Mathematical models</subject><subject>Mathematical programming</subject><subject>Mathematics</subject><subject>Nonlinearity</subject><subject>Operational research and scientific management</subject><subject>Operational research. 
Management science</subject><subject>Optimization</subject><subject>Probabilistic planning</subject><subject>Probability and statistics</subject><subject>Probability theory and stochastic processes</subject><subject>Robust planning</subject><subject>Sciences and techniques of general use</subject><subject>Transition probabilities</subject><subject>Uncertainty</subject><issn>0004-3702</issn><issn>1872-7921</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2011</creationdate><recordtype>article</recordtype><recordid>eNp9kE9LxDAQxYMouK5-Aw-5iKfWSZu2m4sguv6BFT3oOaTpBLN0mzXJKn57U7p4FAaGgTfz5v0IOWeQM2D11TpXPtoh5gUwlkMqYAdkxhZNkTWiYIdkBgA8KxsojslJCOs0lkKwGVktjbHa4hBpcP0uWjcEGh01SkfnsaPPd6-Bftv4Qe1m61HbgDR6NQQ7aunWu1a1tk8ThlNyZFQf8Gzf5-T9fvl2-5itXh6ebm9WmeaMx4w3bddWHYfSmBLrQndCKa50i4xxvWAI2lQgWtG1XKm2rqoqpWjKToNGUdTlnFxOd5P75w5DlBsbNPa9GtDtghQJSgW8YknJJ6X2LgSPRm693Sj_IxnIkZ1cy4mdHNlJSAXj2sXeQAWtepPypuB_uwUvmvRgk3TXkw5T2i-LXoaRpcbOJlRRds7-b_QLxSiIRA</recordid><startdate>20110601</startdate><enddate>20110601</enddate><creator>Delgado, Karina Valdivia</creator><creator>Sanner, Scott</creator><creator>de Barros, Leliane Nunes</creator><general>Elsevier B.V</general><general>Elsevier</general><scope>6I.</scope><scope>AAFTH</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20110601</creationdate><title>Efficient solutions to factored MDPs with imprecise transition probabilities</title><author>Delgado, Karina Valdivia ; Sanner, Scott ; de Barros, Leliane 
Nunes</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c414t-47bdb5d403ff3e62cd9aa4acbe114c81e0cf509b9db4aab655579273dc0ce9263</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Algorithms</topic><topic>Applied sciences</topic><topic>Approximation</topic><topic>Artificial intelligence</topic><topic>Computer science; control theory; systems</topic><topic>Decision theory. Utility theory</topic><topic>Exact sciences and technology</topic><topic>Learning and adaptive systems</topic><topic>Markov Decision Process</topic><topic>Markov processes</topic><topic>Mathematical analysis</topic><topic>Mathematical models</topic><topic>Mathematical programming</topic><topic>Mathematics</topic><topic>Nonlinearity</topic><topic>Operational research and scientific management</topic><topic>Operational research. Management science</topic><topic>Optimization</topic><topic>Probabilistic planning</topic><topic>Probability and statistics</topic><topic>Probability theory and stochastic processes</topic><topic>Robust planning</topic><topic>Sciences and techniques of general use</topic><topic>Transition probabilities</topic><topic>Uncertainty</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Delgado, Karina Valdivia</creatorcontrib><creatorcontrib>Sanner, Scott</creatorcontrib><creatorcontrib>de Barros, Leliane Nunes</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science 
Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Artificial intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Delgado, Karina Valdivia</au><au>Sanner, Scott</au><au>de Barros, Leliane Nunes</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Efficient solutions to factored MDPs with imprecise transition probabilities</atitle><jtitle>Artificial intelligence</jtitle><date>2011-06-01</date><risdate>2011</risdate><volume>175</volume><issue>9</issue><spage>1498</spage><epage>1527</epage><pages>1498-1527</pages><issn>0004-3702</issn><eissn>1872-7921</eissn><coden>AINTBB</coden><abstract>When modeling real-world decision-theoretic planning problems in the Markov Decision Process (MDP) framework, it is often impossible to obtain a completely accurate estimate of transition probabilities. For example, natural uncertainty arises in the transition specification due to elicitation of MDP transition models from an expert or estimation from data, or non-stationary transition distributions arising from insufficient state knowledge. In the interest of obtaining the most robust policy under transition uncertainty, the Markov Decision Process with Imprecise Transition Probabilities (MDP-IPs) has been introduced to model such scenarios. Unfortunately, while various solution algorithms exist for MDP-IPs, they often require external calls to optimization routines and thus can be extremely time-consuming in practice. To address this deficiency, we introduce the factored MDP-IP and propose efficient dynamic programming methods to exploit its structure. 
Noting that the key computational bottleneck in the solution of factored MDP-IPs is the need to repeatedly solve nonlinear constrained optimization problems, we show how to target approximation techniques to drastically reduce the computational overhead of the nonlinear solver while producing bounded, approximately optimal solutions. Our results show up to two orders of magnitude speedup in comparison to traditional “flat” dynamic programming approaches and up to an order of magnitude speedup over the extension of factored MDP approximate value iteration techniques to MDP-IPs while producing the lowest error of any approximation algorithm evaluated.</abstract><cop>Oxford</cop><pub>Elsevier B.V</pub><doi>10.1016/j.artint.2011.01.001</doi><tpages>30</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0004-3702
ispartof	Artificial intelligence, 2011-06, Vol.175 (9), p.1498-1527
issn	0004-3702 1872-7921
language	eng
recordid	cdi_proquest_miscellaneous_901650451
source	Elsevier ScienceDirect Journals Complete; EZB-FREE-00999 freely available EZB journals
subjects	Algorithms; Applied sciences; Approximation; Artificial intelligence; Computer science; control theory; systems; Decision theory. Utility theory; Exact sciences and technology; Learning and adaptive systems; Markov Decision Process; Markov processes; Mathematical analysis; Mathematical models; Mathematical programming; Mathematics; Nonlinearity; Operational research and scientific management; Operational research. Management science; Optimization; Probabilistic planning; Probability and statistics; Probability theory and stochastic processes; Robust planning; Sciences and techniques of general use; Transition probabilities; Uncertainty
title	Efficient solutions to factored MDPs with imprecise transition probabilities
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T09%3A11%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Efficient%20solutions%20to%20factored%20MDPs%20with%20imprecise%20transition%20probabilities&rft.jtitle=Artificial%20intelligence&rft.au=Delgado,%20Karina%20Valdivia&rft.date=2011-06-01&rft.volume=175&rft.issue=9&rft.spage=1498&rft.epage=1527&rft.pages=1498-1527&rft.issn=0004-3702&rft.eissn=1872-7921&rft.coden=AINTBB&rft_id=info:doi/10.1016/j.artint.2011.01.001&rft_dat=%3Cproquest_cross%3E901650451%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=901650451&rft_id=info:pmid/&rft_els_id=S0004370211000026&rfr_iscdi=true