Kernel-Based Approximate Dynamic Programming Using Bellman Residual Elimination

creator Bethke, Brett M
description Many sequential decision-making problems related to multi-agent robotic systems can be naturally posed as Markov Decision Processes (MDPs). An important advantage of the MDP framework is the ability to utilize stochastic system models, thereby allowing the system to make sound decisions even if there is randomness in the system evolution over time. Unfortunately, the curse of dimensionality prevents most MDPs of practical size from being solved exactly. One main focus of the thesis is the development of a new family of algorithms for computing approximate solutions to large-scale MDPs. Our algorithms are similar in spirit to Bellman residual methods, which attempt to minimize the error incurred in solving Bellman's equation at a set of sample states. However, by exploiting kernel-based regression techniques (such as support vector regression and Gaussian process regression) with nondegenerate kernel functions as the underlying cost-to-go function approximation architecture, our algorithms are able to construct cost-to-go solutions for which the Bellman residuals are explicitly forced to zero at the sample states. For this reason, we have named our approach Bellman residual elimination (BRE). In addition to developing the basic ideas behind BRE, we present multi-stage and model-free extensions to the approach. The multi-stage extension allows for automatic selection of an appropriate kernel for the MDP at hand, while the model-free extension can use simulated or real state trajectory data to learn an approximate policy when a system model is unavailable. (See the illustrative sketch following this record.)
format Report
creatorcontrib MASSACHUSETTS INST OF TECH CAMBRIDGE
creationdate 2010-02
rights Approved for public release; distribution is unlimited.
language eng
recordid cdi_dtic_stinet_ADA528927
source DTIC Technical Reports
subjects ALGORITHMS
BRE(BELLMAN RESIDUAL ELIMINATION)
DECISION MAKING
DYNAMIC PROGRAMMING
ELIMINATION
KERNEL FUNCTIONS
MARKOV PROCESSES
MDPS(MARKOV DECISION PROCESSES)
Statistics and Probability
THESES
title Kernel-Based Approximate Dynamic Programming Using Bellman Residual Elimination
url https://apps.dtic.mil/sti/citations/ADA528927
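
Illustrative sketch of the zero-residual idea. The description above states that BRE builds a kernel-based cost-to-go approximation whose Bellman residuals are exactly zero at a set of sample states. The Python sketch below shows one minimal way to realize that property for a fixed policy on a small synthetic MDP: the cost-to-go is written as a kernel expansion over the sample states, and the kernel weights are found by solving the linear system that sets the Bellman residual to zero at each sample. This is an assumption-laden illustration, not the report's support vector regression or Gaussian process regression formulation; the Gaussian kernel, the random MDP, the one-dimensional state embedding x, and helper names such as gaussian_kernel are all hypothetical choices made here for clarity.

import numpy as np

# Minimal sketch of the core idea behind Bellman residual elimination (BRE):
# for a fixed policy, represent the cost-to-go as a kernel expansion over a
# set of sample states and solve a linear system so that the Bellman residual
# is exactly zero at those samples. Illustration only, not the report's
# support vector / Gaussian process formulation.

rng = np.random.default_rng(0)

n_states = 20      # small synthetic MDP so every quantity can be checked directly
alpha = 0.95       # discount factor

# Random transition matrix P[s, s'] and stage costs g[s] under a fixed policy.
P = rng.random((n_states, n_states))
P /= P.sum(axis=1, keepdims=True)
g = rng.random(n_states)

# Assumed one-dimensional embedding of the states, used only by the kernel.
x = np.linspace(0.0, 1.0, n_states)

def gaussian_kernel(a, b, length_scale=0.2):
    """Nondegenerate (Gaussian) kernel on the state embedding."""
    return np.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))

# Sample states at which the Bellman residual will be forced to zero.
samples = np.sort(rng.choice(n_states, size=8, replace=False))

# Cost-to-go approximation: J_hat(s) = sum_j w[j] * k(x[s_j], x[s]).
K = gaussian_kernel(x[samples][:, None], x[None, :])   # shape (m, n): K[j, s] = k(s_j, s)

# Bellman residual at a sample state s_i:
#   BR(s_i) = J_hat(s_i) - g(s_i) - alpha * sum_{s'} P[s_i, s'] * J_hat(s')
# Requiring BR(s_i) = 0 at every sample gives the linear system A w = g[samples],
# with A[i, j] = k(s_j, s_i) - alpha * sum_{s'} P[s_i, s'] * k(s_j, s').
A = K[:, samples].T - alpha * P[samples] @ K.T
w = np.linalg.solve(A, g[samples])

# Evaluate the kernel expansion at every state and check the residuals.
J_hat = K.T @ w
residuals = J_hat - (g + alpha * P @ J_hat)
print("max |Bellman residual| at sample states:", np.abs(residuals[samples]).max())
print("max |Bellman residual| at other states :", np.abs(np.delete(residuals, samples)).max())

Running the sketch prints residuals at the sample states that are zero up to floating-point precision, while residuals at the remaining states are typically nonzero. The use of a nondegenerate kernel matters here: for distinct sample states it yields a generically well-conditioned linear system, which mirrors the report's point that nondegenerate kernel functions allow the Bellman residuals to be eliminated exactly at the samples.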