Learning Partially Observable Deterministic Action Models

We present exact algorithms for identifying the effects and preconditions of deterministic actions in dynamic partially observable domains. They apply when one does not know the action model (the way actions affect the world) of a domain and must learn it from partial observations over time. Such scenarios are common in real-world applications. ...

Full description

Saved in:
Bibliographic details
Published in: The Journal of artificial intelligence research 2008-01, Vol.33, p.349-402
Main authors: Amir, E., Chang, A.
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 402
container_issue
container_start_page 349
container_title The Journal of artificial intelligence research
container_volume 33
creator Amir, E.
Chang, A.
description We present exact algorithms for identifying the effects and preconditions of deterministic actions in dynamic partially observable domains. They apply when one does not know the action model (the way actions affect the world) of a domain and must learn it from partial observations over time. Such scenarios are common in real-world applications. They are challenging for AI tasks because traditional domain structures that underlie tractability (e.g., conditional independence) fail there (e.g., world features become correlated). Our work departs from traditional assumptions about partial observations and action models. In particular, it focuses on problems in which actions are deterministic and of simple logical structure, and observation models have all features observed with some frequency. We obtain tractable algorithms for the modified problem in such domains. Our algorithms take sequences of partial observations over time as input, and output deterministic action models that could have led to those observations. The algorithms output all or one of those models (depending on our choice), and are exact in that no model is misclassified given the observations. Our algorithms take polynomial time in the number of time steps and state features for some traditional action classes examined in the AI-planning literature, e.g., STRIPS actions. In contrast, traditional approaches for HMMs and reinforcement learning are inexact and exponentially intractable for such domains. Our experiments verify the theoretical tractability guarantees and show that we identify action models exactly. Several applications in planning, autonomous exploration, and adventure-game playing already use these results. They are also promising for probabilistic settings, partially observable reinforcement learning, and diagnosis.
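As a rough intuition for the "output all consistent models" idea in the abstract, the Python sketch below filters candidate effect models of a single deterministic, unconditional (STRIPS-style, effect-only) action against partially observed transitions. It is only an illustration under stated assumptions, not the authors' algorithm: the fluent names, the "unlock" action, the observation format, and the brute-force enumeration of candidates are all hypothetical, and the enumeration is exponential in the number of fluents, which is exactly the blow-up the paper's polynomial-time algorithms avoid.

# Illustrative sketch only: naive consistency-based learning of the effects of one
# deterministic action from partially observed transitions. NOT the paper's algorithm;
# fluents, action, and data are hypothetical.
from itertools import product

FLUENTS = ["door_open", "have_key"]          # tiny propositional vocabulary (assumed)

def candidate_effects():
    """Enumerate every assignment of {set true, set false, leave unchanged}
    to each fluent -- one candidate effect model per combination."""
    choices = [True, False, None]            # None = fluent unchanged by the action
    for combo in product(choices, repeat=len(FLUENTS)):
        yield dict(zip(FLUENTS, combo))

def consistent(effect, pre_obs, post_obs):
    """Check a candidate effect model against one partially observed transition.
    pre_obs / post_obs map only the *observed* fluents to values; unobserved
    fluents are absent, which is what makes the observations partial."""
    for f, val in post_obs.items():
        if effect[f] is not None:
            if effect[f] != val:             # model forces f to a value we did not see
                return False
        elif f in pre_obs and pre_obs[f] != val:
            return False                     # f changed although the model says it cannot
    return True

def learn(action_transitions):
    """Keep exactly the candidate models consistent with every observed transition:
    no consistent model is discarded and no inconsistent one is kept."""
    models = list(candidate_effects())
    for pre_obs, post_obs in action_transitions:
        models = [m for m in models if consistent(m, pre_obs, post_obs)]
    return models

if __name__ == "__main__":
    # Two partially observed executions of a hypothetical "unlock" action.
    observed = [
        ({"have_key": True}, {"door_open": True}),   # door_open unobserved beforehand
        ({"door_open": False, "have_key": True}, {"door_open": True, "have_key": True}),
    ]
    for m in learn(observed):
        print(m)

Run on the two observations above, the sketch keeps exactly the two models in which the action sets door_open to true (and either sets have_key to true or leaves it unchanged), mirroring the "output all of those models" behaviour described in the abstract; the paper's contribution is obtaining this result without enumerating the exponentially many candidate models.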
doi_str_mv 10.1613/jair.2575
format Article
publisher AI Access Foundation, San Francisco
fulltext fulltext
identifier ISSN: 1076-9757
ispartof The Journal of artificial intelligence research, 2008-01, Vol.33, p.349-402
issn 1076-9757
1076-9757
1943-5037
language eng
recordid cdi_proquest_journals_2554115452
source Ejournal Publishers (free content); DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals
subjects Algorithms
Artificial intelligence
Domains
Machine learning
Polynomials
title Learning Partially Observable Deterministic Action Models
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T04%3A19%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Learning%20Partially%20Observable%20Deterministic%20Action%20Models&rft.jtitle=The%20Journal%20of%20artificial%20intelligence%20research&rft.au=Amir,%20E.&rft.date=2008-01-01&rft.volume=33&rft.spage=349&rft.epage=402&rft.pages=349-402&rft.issn=1076-9757&rft.eissn=1076-9757&rft_id=info:doi/10.1613/jair.2575&rft_dat=%3Cproquest_cross%3E2554115452%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2554115452&rft_id=info:pmid/&rfr_iscdi=true