Path planning method and system based on reinforcement learning and heuristic search
The invention discloses a path planning method and system based on reinforcement learning and heuristic search. The method comprises the steps of: S1, establishing an environment model under a Markov decision process framework, wherein the state space of the environment model is S, the action space is A, the reward function is R, and the transition probability function is P; S2, performing sampling updates on the environment model through a Dyna-Q algorithm, evaluating each state-action pair, and determining a target point; S3, based on the target point, calculating the Euclidean distances between the current position and the starting point and between the current position and the target point through an A* algorithm, and determining an initial path; S4, assigning a value to each state-action pair in the initial path; S5, determining an optimal action according to the evaluation value and assignment of each state-action pair; and S6, determining …
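Read as reinforcement learning plus heuristic search, steps S1, S2, and S5 describe a tabular Dyna-Q loop: a learned model of the environment is sampled to update the evaluation of each state-action pair, and the greedy action is then taken. The following is a minimal sketch of that loop on a small grid world, offered only as an illustration; the grid layout, reward values, and names such as `GridWorld` and `dyna_q` are assumptions made for this example and are not taken from the patent.

```python
# Illustrative sketch only (not the patented implementation): tabular Dyna-Q on a
# small grid world. The state space S is the set of grid cells, the action space A
# is the four moves, the reward R is -1 per step and +10 at the goal, and the
# learned model stores (reward, next_state) for each visited state-action pair.
import random
from collections import defaultdict

class GridWorld:
    def __init__(self, width=5, height=5, goal=(4, 4), obstacles=frozenset()):
        self.width, self.height = width, height
        self.goal, self.obstacles = goal, obstacles
        self.actions = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # action space A

    def step(self, state, action):
        """Deterministic transition (plays the role of P) with reward R."""
        x, y = state[0] + action[0], state[1] + action[1]
        nxt = (x, y)
        if not (0 <= x < self.width and 0 <= y < self.height) or nxt in self.obstacles:
            nxt = state                                      # blocked: stay in place
        reward = 10.0 if nxt == self.goal else -1.0
        return nxt, reward, nxt == self.goal

def dyna_q(env, episodes=200, planning_steps=10, alpha=0.1, gamma=0.95, eps=0.1):
    Q = defaultdict(float)                                   # evaluation of state-action pairs
    model = {}                                               # learned model: (s, a) -> (r, s')
    for _ in range(episodes):
        s, done = (0, 0), False
        while not done:
            a = (random.choice(env.actions) if random.random() < eps
                 else max(env.actions, key=lambda act: Q[(s, act)]))
            s2, r, done = env.step(s, a)
            # direct reinforcement-learning update from the real transition
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in env.actions) - Q[(s, a)])
            model[(s, a)] = (r, s2)
            # planning: replay transitions sampled from the learned model
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2) = random.choice(list(model.items()))
                Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, b)] for b in env.actions) - Q[(ps, pa)])
            s = s2
    return Q

if __name__ == "__main__":
    Q = dyna_q(GridWorld())
    print("value of moving right from the start:", Q[((0, 0), (1, 0))])
```

The planning loop is what distinguishes Dyna-Q from plain Q-learning: each real transition is stored in the model and replayed several times, so the evaluation of state-action pairs converges with fewer real interactions with the environment.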
Saved in:

Main authors: | ZHANG XIULING; KANG XUENAN; LI JINXIANG
---|---
Format: | Patent
Language: | chi ; eng
Subjects: |
Online access: | Order full text
creator | ZHANG XIULING; KANG XUENAN; LI JINXIANG
description | The invention discloses a path planning method and system based on reinforcement learning and heuristic search. The method comprises the steps of: S1, establishing an environment model under a Markov decision process framework, wherein the state space of the environment model is S, the action space is A, the reward function is R, and the transition probability function is P; S2, performing sampling updates on the environment model through a Dyna-Q algorithm, evaluating each state-action pair, and determining a target point; S3, based on the target point, calculating the Euclidean distances between the current position and the starting point and between the current position and the target point through an A* algorithm, and determining an initial path; S4, assigning a value to each state-action pair in the initial path; S5, determining an optimal action according to the evaluation value and assignment of each state-action pair; and S6, determining … A hedged sketch of the A* heuristic used in step S3 appears after the record fields below. |
format | Patent |
fulltext | fulltext_linktorsrc |
language | chi ; eng |
recordid | cdi_epo_espacenet_CN111896006A |
source | esp@cenet |
subjects | CONTROL OR REGULATING SYSTEMS IN GENERAL; CONTROLLING; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; GYROSCOPIC INSTRUMENTS; MEASURING; MEASURING DISTANCES, LEVELS OR BEARINGS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS; NAVIGATION; PHOTOGRAMMETRY OR VIDEOGRAMMETRY; PHYSICS; REGULATING; SURVEYING; SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES; TESTING
title | Path planning method and system based on reinforcement learning and heuristic search |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T04%3A34%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=ZHANG%20XIULING&rft.date=2020-11-06&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN111896006A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
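Step S3 of the description ranks candidate positions by the Euclidean distances between the current position and the starting point and between the current position and the target point, which corresponds to the standard A* evaluation f = g + h with a Euclidean heuristic. The sketch below illustrates that reading on the same kind of grid as above; the function names (`a_star`, `euclidean`), the obstacle set, and the grid size are assumptions for the example, not details from the patent.

```python
# Illustrative sketch only: A* on a grid, prioritising cells by the path cost
# accumulated from the start (g) plus the Euclidean distance to the target (h).
import heapq
import math

def euclidean(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def a_star(start, goal, width, height, obstacles=frozenset()):
    """Return an initial path from start to goal as a list of grid cells."""
    open_heap = [(euclidean(start, goal), 0.0, start, [start])]
    closed = set()
    while open_heap:
        f, g, cell, path = heapq.heappop(open_heap)
        if cell == goal:
            return path                                      # initial path for the later steps
        if cell in closed:
            continue
        closed.add(cell)
        for dx, dy in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            nxt = (cell[0] + dx, cell[1] + dy)
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in obstacles and nxt not in closed):
                g2 = g + euclidean(cell, nxt)                # distance travelled from the start
                f2 = g2 + euclidean(nxt, goal)               # plus distance to the target point
                heapq.heappush(open_heap, (f2, g2, nxt, path + [nxt]))
    return []

if __name__ == "__main__":
    print(a_star((0, 0), (4, 4), 5, 5, obstacles={(2, 2), (2, 3)}))
```

The path returned here would play the role of the initial path that steps S4 through S6 then value and refine using the Q-values learned by the Dyna-Q loop sketched earlier.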