A framework and a mean-field algorithm for the local control of spatial processes

The Markov Decision Process (MDP) framework is a tool for the efficient modelling and solving of sequential decision-making problems under uncertainty. However, it reaches its limits when state and action spaces are large, as can happen for spatially explicit decision problems. Factored MDPs and ded...

Detailed description

Bibliographic details
Published in: International journal of approximate reasoning, 2012, Vol. 53 (1), p. 66-86
Main authors: Sabbadin, Régis; Peyrard, Nathalie; Forsell, Nicklas
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 86
container_issue 1
container_start_page 66
container_title International journal of approximate reasoning
container_volume 53
creator Sabbadin, Régis
Peyrard, Nathalie
Forsell, Nicklas
description The Markov Decision Process (MDP) framework is a tool for the efficient modelling and solving of sequential decision-making problems under uncertainty. However, it reaches its limits when state and action spaces are large, as can happen for spatially explicit decision problems. Factored MDPs and dedicated solution algorithms have been introduced to deal with large factored state spaces. But the case of large action spaces remains an issue. In this article, we define graph-based Markov Decision Processes (GMDPs), a particular Factored MDP framework which exploits the factorization of the state space and the action space of a decision problem. Both spaces are assumed to have the same dimension. Transition probabilities and rewards are factored according to a single graph structure, where nodes represent pairs of state/decision variables of the problem. The complexity of this representation grows only linearly with the size of the graph, whereas the complexity of exact resolution grows exponentially. We propose an approximate solution algorithm exploiting the structure of a GMDP and whose complexity only grows quadratically with the size of the graph and exponentially with the maximum number of neighbours of any node. This algorithm, referred to as MF-API, belongs to the family of Approximate Policy Iteration (API) algorithms. It relies on a mean-field approximation of the value function of a policy and on a search limited to the suboptimal set of local policies. We compare it, in terms of performance, with two state-of-the-art algorithms for Factored MDPs: SPUDD and Approximate Linear Programming (ALP). Our experiments show that SPUDD is not generally applicable to solving GMDPs, due to the size of the action space we want to tackle. On the other hand, ALP can be adapted to solve GMDPs. We show that ALP is faster than MF-API and provides solutions of similar quality for most problems. 
However, for some problems MF-API provides significantly better policies, and in all cases provides a better approximation of the value function of approximate policies. These promising results show that the GMDP model offers a convenient framework for modelling and solving a large range of spatial and structured planning problems, that can arise in many different domains where processes are managed over networks: natural resources, agriculture, computer networks, etc.
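The factorization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the ring graph, the binary local states and actions, and the functions `local_transition` and `local_reward` are all hypothetical choices made for illustration. The sketch assumes the usual factored form, in which the global transition probability is a product of local terms and the global reward is a sum of local terms, each depending only on a node's neighbourhood and its own action. It also contrasts the size of the factored representation with that of a flat joint table.

```python
from itertools import product
from math import prod

STATES = (0, 1)   # hypothetical labels: 0 = healthy, 1 = infected
ACTIONS = (0, 1)  # hypothetical labels: 0 = do nothing, 1 = treat


def ring_neighbourhoods(n):
    """Neighbourhood N(i) on a ring: (left neighbour, node itself, right neighbour)."""
    return {i: ((i - 1) % n, i, (i + 1) % n) for i in range(n)}


def local_transition(next_state, neighbour_states, action):
    """Hypothetical local model P_i(s'_i | s_N(i), a_i)."""
    p_infected = 0.2 * sum(neighbour_states)  # risk grows with infected neighbours
    if action == 1:                           # treating the node halves its risk
        p_infected *= 0.5
    return p_infected if next_state == 1 else 1.0 - p_infected


def local_reward(neighbour_states, action):
    """Hypothetical local reward r_i(s_N(i), a_i): healthy node pays 1, treatment costs 0.3."""
    own_state = neighbour_states[1]  # middle entry is the node itself
    return (1.0 if own_state == 0 else 0.0) - 0.3 * action


def global_transition(next_states, states, actions, hoods):
    """Global transition probability as a product of local terms."""
    return prod(
        local_transition(next_states[i], tuple(states[j] for j in hoods[i]), actions[i])
        for i in hoods
    )


def global_reward(states, actions, hoods):
    """Global reward as a sum of local terms."""
    return sum(
        local_reward(tuple(states[j] for j in hoods[i]), actions[i]) for i in hoods
    )


def factored_size(hoods):
    """Entries needed to tabulate all local transition functions (linear in node count)."""
    return sum(len(STATES) ** len(h) * len(ACTIONS) * len(STATES) for h in hoods.values())


def joint_size(n):
    """Entries needed for a flat table over joint states, joint actions and next states."""
    return (len(STATES) ** n) * (len(ACTIONS) ** n) * (len(STATES) ** n)


if __name__ == "__main__":
    hoods = ring_neighbourhoods(4)
    print(factored_size(hoods), joint_size(4))  # 128 versus 4096 for four nodes
```

With a fixed neighbourhood size, each extra node adds a constant number of local table entries, whereas the flat joint table is multiplied by a constant factor per node. This is the linear-versus-exponential contrast the abstract refers to.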
doi_str_mv 10.1016/j.ijar.2011.09.007
format Article
fullrecord <record><control><sourceid>proquest_hal_p</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_00833151v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0888613X11001435</els_id><sourcerecordid>1022889244</sourcerecordid><originalsourceid>FETCH-LOGICAL-c441t-933a6b9dac8acf5e3c4956ec88f0c450118d660642ad05146a956ccdffd80dd43</originalsourceid><addsrcrecordid>eNp9kE2LFDEQhoMoOK7-AU-5CHrottL56DR4GRZ1hQERFLyFmFScjOnOmPSu-O9NM8sePYVUnnqr8hDykkHPgKm3pz6ebOkHYKyHqQcYH5Ed0yPvxMjZY7IDrXWnGP_-lDyr9QQAahR6R77saSh2xj-5_KJ28dTSGe3ShYipXdLPXOJ6nGnIha5HpCk7m6jLy1pyojnQerZrbKVzyQ5rxfqcPAk2VXxxf16Rbx_ef72-6Q6fP3663h86JwRbu4lzq35M3jptXZDInZikQqd1ACdk-4j2SoESg_UgmVC2PTvnQ_AavBf8iry55B5tMucSZ1v-mmyjudkfzFYD0Jwzye5YY19f2Lbl71usq5ljdZiSXTDfVsNgGLSeBrHFDhfUlVxrwfCQzcBsrs3JbK7N5trA1MaMrenVfb6tzU8zurhYHzoHKQctpW7cuwuHTcxdxGKqi7g49LGgW43P8X9j_gHRy5Py</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1022889244</pqid></control><display><type>article</type><title>A framework and a mean-field algorithm for the local control of spatial processes</title><source>Elsevier ScienceDirect Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Sabbadin, Régis ; Peyrard, Nathalie ; Forsell, Nicklas</creator><creatorcontrib>Sabbadin, Régis ; Peyrard, Nathalie ; Forsell, Nicklas</creatorcontrib><description>The Markov Decision Process (MDP) framework is a tool for the efficient modelling and solving of sequential decision-making problems under uncertainty. However, it reaches its limits when state and action spaces are large, as can happen for spatially explicit decision problems. Factored MDPs and dedicated solution algorithms have been introduced to deal with large factored state spaces. But the case of large action spaces remains an issue. 
In this article, we define graph-based Markov Decision Processes (GMDPs), a particular Factored MDP framework which exploits the factorization of the state space and the action space of a decision problem. Both spaces are assumed to have the same dimension. Transition probabilities and rewards are factored according to a single graph structure, where nodes represent pairs of state/decision variables of the problem. The complexity of this representation grows only linearly with the size of the graph, whereas the complexity of exact resolution grows exponentially. We propose an approximate solution algorithm exploiting the structure of a GMDP and whose complexity only grows quadratically with the size of the graph and exponentially with the maximum number of neighbours of any node. This algorithm, referred to as MF-API, belongs to the family of Approximate Policy Iteration (API) algorithms. It relies on a mean-field approximation of the value function of a policy and on a search limited to the suboptimal set of local policies. We compare it, in terms of performance, with two state-of-the-art algorithms for Factored MDPs: SPUDD and Approximate Linear Programming (ALP). Our experiments show that SPUDD is not generally applicable to solving GMDPs, due to the size of the action space we want to tackle. On the other hand, ALP can be adapted to solve GMDPs. We show that ALP is faster than MF-API and provides solutions of similar quality for most problems. However, for some problems MF-API provides significantly better policies, and in all cases provides a better approximation of the value function of approximate policies. 
These promising results show that the GMDP model offers a convenient framework for modelling and solving a large range of spatial and structured planning problems, that can arise in many different domains where processes are managed over networks: natural resources, agriculture, computer networks, etc.</description><identifier>ISSN: 0888-613X</identifier><identifier>EISSN: 1873-4731</identifier><identifier>DOI: 10.1016/j.ijar.2011.09.007</identifier><identifier>CODEN: IJARE4</identifier><language>eng</language><publisher>Amsterdam: Elsevier Inc</publisher><subject>Algorithms ; Applied sciences ; Approximate linear programming ; Approximate policy iteration ; Approximation ; Complexity ; Decision theory. Utility theory ; Decision-theoretic planning ; domain_sde.plan ; domain_sde.plt ; Environmental Sciences ; Exact sciences and technology ; Factored Markov decision processes ; Graphs ; Markov processes ; Mathematical analysis ; Mathematical models ; Mathematical programming ; Mathematics ; Mean-field principle ; Modelling ; Operational research and scientific management ; Operational research. 
Management science ; Policies ; Probability and statistics ; Probability theory and stochastic processes ; Sciences and techniques of general use</subject><ispartof>International journal of approximate reasoning, 2012, Vol.53 (1), p.66-86</ispartof><rights>2011 Elsevier Inc.</rights><rights>2015 INIST-CNRS</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c441t-933a6b9dac8acf5e3c4956ec88f0c450118d660642ad05146a956ccdffd80dd43</citedby><cites>FETCH-LOGICAL-c441t-933a6b9dac8acf5e3c4956ec88f0c450118d660642ad05146a956ccdffd80dd43</cites><orcidid>0000-0002-0356-1255 ; 0000-0002-6286-1821</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S0888613X11001435$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>230,314,776,780,881,3536,4009,27902,27903,27904,65309</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=25528558$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://minesparis-psl.hal.science/hal-00833151$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Sabbadin, Régis</creatorcontrib><creatorcontrib>Peyrard, Nathalie</creatorcontrib><creatorcontrib>Forsell, Nicklas</creatorcontrib><title>A framework and a mean-field algorithm for the local control of spatial processes</title><title>International journal of approximate reasoning</title><description>The Markov Decision Process (MDP) framework is a tool for the efficient modelling and solving of sequential decision-making problems under uncertainty. 
However, it reaches its limits when state and action spaces are large, as can happen for spatially explicit decision problems. Factored MDPs and dedicated solution algorithms have been introduced to deal with large factored state spaces. But the case of large action spaces remains an issue. In this article, we define graph-based Markov Decision Processes (GMDPs), a particular Factored MDP framework which exploits the factorization of the state space and the action space of a decision problem. Both spaces are assumed to have the same dimension. Transition probabilities and rewards are factored according to a single graph structure, where nodes represent pairs of state/decision variables of the problem. The complexity of this representation grows only linearly with the size of the graph, whereas the complexity of exact resolution grows exponentially. We propose an approximate solution algorithm exploiting the structure of a GMDP and whose complexity only grows quadratically with the size of the graph and exponentially with the maximum number of neighbours of any node. This algorithm, referred to as MF-API, belongs to the family of Approximate Policy Iteration (API) algorithms. It relies on a mean-field approximation of the value function of a policy and on a search limited to the suboptimal set of local policies. We compare it, in terms of performance, with two state-of-the-art algorithms for Factored MDPs: SPUDD and Approximate Linear Programming (ALP). Our experiments show that SPUDD is not generally applicable to solving GMDPs, due to the size of the action space we want to tackle. On the other hand, ALP can be adapted to solve GMDPs. We show that ALP is faster than MF-API and provides solutions of similar quality for most problems. However, for some problems MF-API provides significantly better policies, and in all cases provides a better approximation of the value function of approximate policies. 
These promising results show that the GMDP model offers a convenient framework for modelling and solving a large range of spatial and structured planning problems, that can arise in many different domains where processes are managed over networks: natural resources, agriculture, computer networks, etc.</description><subject>Algorithms</subject><subject>Applied sciences</subject><subject>Approximate linear programming</subject><subject>Approximate policy iteration</subject><subject>Approximation</subject><subject>Complexity</subject><subject>Decision theory. Utility theory</subject><subject>Decision-theoretic planning</subject><subject>domain_sde.plan</subject><subject>domain_sde.plt</subject><subject>Environmental Sciences</subject><subject>Exact sciences and technology</subject><subject>Factored Markov decision processes</subject><subject>Graphs</subject><subject>Markov processes</subject><subject>Mathematical analysis</subject><subject>Mathematical models</subject><subject>Mathematical programming</subject><subject>Mathematics</subject><subject>Mean-field principle</subject><subject>Modelling</subject><subject>Operational research and scientific management</subject><subject>Operational research. 
Management science</subject><subject>Policies</subject><subject>Probability and statistics</subject><subject>Probability theory and stochastic processes</subject><subject>Sciences and techniques of general use</subject><issn>0888-613X</issn><issn>1873-4731</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><recordid>eNp9kE2LFDEQhoMoOK7-AU-5CHrottL56DR4GRZ1hQERFLyFmFScjOnOmPSu-O9NM8sePYVUnnqr8hDykkHPgKm3pz6ebOkHYKyHqQcYH5Ed0yPvxMjZY7IDrXWnGP_-lDyr9QQAahR6R77saSh2xj-5_KJ28dTSGe3ShYipXdLPXOJ6nGnIha5HpCk7m6jLy1pyojnQerZrbKVzyQ5rxfqcPAk2VXxxf16Rbx_ef72-6Q6fP3663h86JwRbu4lzq35M3jptXZDInZikQqd1ACdk-4j2SoESg_UgmVC2PTvnQ_AavBf8iry55B5tMucSZ1v-mmyjudkfzFYD0Jwzye5YY19f2Lbl71usq5ljdZiSXTDfVsNgGLSeBrHFDhfUlVxrwfCQzcBsrs3JbK7N5trA1MaMrenVfb6tzU8zurhYHzoHKQctpW7cuwuHTcxdxGKqi7g49LGgW43P8X9j_gHRy5Py</recordid><startdate>2012</startdate><enddate>2012</enddate><creator>Sabbadin, Régis</creator><creator>Peyrard, Nathalie</creator><creator>Forsell, Nicklas</creator><general>Elsevier Inc</general><general>Elsevier</general><scope>6I.</scope><scope>AAFTH</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>1XC</scope><orcidid>https://orcid.org/0000-0002-0356-1255</orcidid><orcidid>https://orcid.org/0000-0002-6286-1821</orcidid></search><sort><creationdate>2012</creationdate><title>A framework and a mean-field algorithm for the local control of spatial processes</title><author>Sabbadin, Régis ; Peyrard, Nathalie ; Forsell, Nicklas</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c441t-933a6b9dac8acf5e3c4956ec88f0c450118d660642ad05146a956ccdffd80dd43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Algorithms</topic><topic>Applied sciences</topic><topic>Approximate linear 
programming</topic><topic>Approximate policy iteration</topic><topic>Approximation</topic><topic>Complexity</topic><topic>Decision theory. Utility theory</topic><topic>Decision-theoretic planning</topic><topic>domain_sde.plan</topic><topic>domain_sde.plt</topic><topic>Environmental Sciences</topic><topic>Exact sciences and technology</topic><topic>Factored Markov decision processes</topic><topic>Graphs</topic><topic>Markov processes</topic><topic>Mathematical analysis</topic><topic>Mathematical models</topic><topic>Mathematical programming</topic><topic>Mathematics</topic><topic>Mean-field principle</topic><topic>Modelling</topic><topic>Operational research and scientific management</topic><topic>Operational research. Management science</topic><topic>Policies</topic><topic>Probability and statistics</topic><topic>Probability theory and stochastic processes</topic><topic>Sciences and techniques of general use</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sabbadin, Régis</creatorcontrib><creatorcontrib>Peyrard, Nathalie</creatorcontrib><creatorcontrib>Forsell, Nicklas</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Hyper Article en Ligne (HAL)</collection><jtitle>International journal of approximate reasoning</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sabbadin, 
Régis</au><au>Peyrard, Nathalie</au><au>Forsell, Nicklas</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A framework and a mean-field algorithm for the local control of spatial processes</atitle><jtitle>International journal of approximate reasoning</jtitle><date>2012</date><risdate>2012</risdate><volume>53</volume><issue>1</issue><spage>66</spage><epage>86</epage><pages>66-86</pages><issn>0888-613X</issn><eissn>1873-4731</eissn><coden>IJARE4</coden><abstract>The Markov Decision Process (MDP) framework is a tool for the efficient modelling and solving of sequential decision-making problems under uncertainty. However, it reaches its limits when state and action spaces are large, as can happen for spatially explicit decision problems. Factored MDPs and dedicated solution algorithms have been introduced to deal with large factored state spaces. But the case of large action spaces remains an issue. In this article, we define graph-based Markov Decision Processes (GMDPs), a particular Factored MDP framework which exploits the factorization of the state space and the action space of a decision problem. Both spaces are assumed to have the same dimension. Transition probabilities and rewards are factored according to a single graph structure, where nodes represent pairs of state/decision variables of the problem. The complexity of this representation grows only linearly with the size of the graph, whereas the complexity of exact resolution grows exponentially. We propose an approximate solution algorithm exploiting the structure of a GMDP and whose complexity only grows quadratically with the size of the graph and exponentially with the maximum number of neighbours of any node. This algorithm, referred to as MF-API, belongs to the family of Approximate Policy Iteration (API) algorithms. It relies on a mean-field approximation of the value function of a policy and on a search limited to the suboptimal set of local policies. 
We compare it, in terms of performance, with two state-of-the-art algorithms for Factored MDPs: SPUDD and Approximate Linear Programming (ALP). Our experiments show that SPUDD is not generally applicable to solving GMDPs, due to the size of the action space we want to tackle. On the other hand, ALP can be adapted to solve GMDPs. We show that ALP is faster than MF-API and provides solutions of similar quality for most problems. However, for some problems MF-API provides significantly better policies, and in all cases provides a better approximation of the value function of approximate policies. These promising results show that the GMDP model offers a convenient framework for modelling and solving a large range of spatial and structured planning problems, that can arise in many different domains where processes are managed over networks: natural resources, agriculture, computer networks, etc.</abstract><cop>Amsterdam</cop><pub>Elsevier Inc</pub><doi>10.1016/j.ijar.2011.09.007</doi><tpages>21</tpages><orcidid>https://orcid.org/0000-0002-0356-1255</orcidid><orcidid>https://orcid.org/0000-0002-6286-1821</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0888-613X
ispartof International journal of approximate reasoning, 2012, Vol.53 (1), p.66-86
issn 0888-613X
1873-4731
language eng
recordid cdi_hal_primary_oai_HAL_hal_00833151v1
source Elsevier ScienceDirect Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Algorithms
Applied sciences
Approximate linear programming
Approximate policy iteration
Approximation
Complexity
Decision theory. Utility theory
Decision-theoretic planning
domain_sde.plan
domain_sde.plt
Environmental Sciences
Exact sciences and technology
Factored Markov decision processes
Graphs
Markov processes
Mathematical analysis
Mathematical models
Mathematical programming
Mathematics
Mean-field principle
Modelling
Operational research and scientific management
Operational research. Management science
Policies
Probability and statistics
Probability theory and stochastic processes
Sciences and techniques of general use
title A framework and a mean-field algorithm for the local control of spatial processes
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T22%3A41%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20framework%20and%20a%20mean-field%20algorithm%20for%20the%20local%20control%20of%20spatial%20processes&rft.jtitle=International%20journal%20of%20approximate%20reasoning&rft.au=Sabbadin,%20R%C3%A9gis&rft.date=2012&rft.volume=53&rft.issue=1&rft.spage=66&rft.epage=86&rft.pages=66-86&rft.issn=0888-613X&rft.eissn=1873-4731&rft.coden=IJARE4&rft_id=info:doi/10.1016/j.ijar.2011.09.007&rft_dat=%3Cproquest_hal_p%3E1022889244%3C/proquest_hal_p%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1022889244&rft_id=info:pmid/&rft_els_id=S0888613X11001435&rfr_iscdi=true