Vehicle optimal control method based on deep reinforcement learning
The invention discloses a vehicle optimal control method based on deep reinforcement learning. The method comprises the following steps: step 1, establishing a policy network and mutually independent value networks; step 2, controlling the vehicle to run and collecting samples; step 3, inputting the...
Main authors: | HUANG XIWEN, HUANG XIANGDANG, FEI HANSHENG, YANG QIULING |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | HUANG XIWEN ; HUANG XIANGDANG ; FEI HANSHENG ; YANG QIULING |
description | The invention discloses a vehicle optimal control method based on deep reinforcement learning. The method comprises the following steps: step 1, establishing a policy network and mutually independent value networks; step 2, controlling the vehicle to run and collecting samples; step 3, inputting the data s_t and a_t into the value networks to obtain two value scores, and calculating a prediction score by taking the smaller value; inputting the state s_{t+1} into the policy network to obtain an action a_{t+1}, inputting s_{t+1} and a_{t+1} into the two value networks to obtain two value scores, determining a TD error from the value scores and the prediction score, and updating the value networks; step 4, updating the policy network after the value networks have been updated twice; and step 5, repeating steps 2-4 to tune the network parameters until the policy network achieves the expected effect, and outputting the finally updated policy network. The stability can be ensured in the process of optimiz |
format | Patent |
fullrecord | <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_CN118372851A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CN118372851A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_CN118372851A3</originalsourceid><addsrcrecordid>eNqNyjEOAiEQBVAaC6PeYTyABW6M2xqisbIythuEvy7JMEOA-8fGA1i95q2Ne2FJgUFaesqeKaj0qkwZfdFIb98QSYUiUKgiyaw1IEM6MXyVJJ-tWc2eG3Y_N2Z_uz7d_YCiE1rxAYI-uYe143A-jid7Gf45X445MqU</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Vehicle optimal control method based on deep reinforcement learning</title><source>esp@cenet</source><creator>HUANG XIWEN ; HUANG XIANGDANG ; FEI HANSHENG ; YANG QIULING</creator><creatorcontrib>HUANG XIWEN ; HUANG XIANGDANG ; FEI HANSHENG ; YANG QIULING</creatorcontrib><description>The invention discloses a vehicle optimal control method based on deep reinforcement learning. The method comprises the following steps: step 1, establishing a strategy network and a mutually independent value network; 2, the vehicle is controlled to run, and samples are collected; 3, inputting the data st and at into a value network to obtain two value scores, and calculating a prediction score by taking the smaller value; inputting the state st + 1 into the strategy network to obtain an action at + 1, respectively inputting the data st + 1 and at + 1 into two value scores in the two value networks, determining a TD error according to the value scores and a prediction score, and updating the value networks; 4, updating the strategy network after the value network is updated twice; and step 5, repeating the steps 2-4 to perform network parameter tuning until the policy network achieves an expected effect, and outputting the finally updated policy network. 
The stability can be ensured in the process of optimiz</description><language>chi ; eng</language><subject>CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE ORDIFFERENT FUNCTION ; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES ; CONTROLLING ; COUNTING ; PERFORMING OPERATIONS ; PHYSICS ; REGULATING ; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TOTHE CONTROL OF A PARTICULAR SUB-UNIT ; SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES ; TRANSPORTING ; VEHICLES IN GENERAL</subject><creationdate>2024</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240723&DB=EPODOC&CC=CN&NR=118372851A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25563,76318</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240723&DB=EPODOC&CC=CN&NR=118372851A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>HUANG XIWEN</creatorcontrib><creatorcontrib>HUANG XIANGDANG</creatorcontrib><creatorcontrib>FEI HANSHENG</creatorcontrib><creatorcontrib>YANG QIULING</creatorcontrib><title>Vehicle optimal control method based on deep reinforcement learning</title><description>The invention discloses a vehicle optimal control method based on deep reinforcement learning. 
The method comprises the following steps: step 1, establishing a strategy network and a mutually independent value network; 2, the vehicle is controlled to run, and samples are collected; 3, inputting the data st and at into a value network to obtain two value scores, and calculating a prediction score by taking the smaller value; inputting the state st + 1 into the strategy network to obtain an action at + 1, respectively inputting the data st + 1 and at + 1 into two value scores in the two value networks, determining a TD error according to the value scores and a prediction score, and updating the value networks; 4, updating the strategy network after the value network is updated twice; and step 5, repeating the steps 2-4 to perform network parameter tuning until the policy network achieves an expected effect, and outputting the finally updated policy network. The stability can be ensured in the process of optimiz</description><subject>CALCULATING</subject><subject>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</subject><subject>COMPUTING</subject><subject>CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE ORDIFFERENT FUNCTION</subject><subject>CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES</subject><subject>CONTROLLING</subject><subject>COUNTING</subject><subject>PERFORMING OPERATIONS</subject><subject>PHYSICS</subject><subject>REGULATING</subject><subject>ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TOTHE CONTROL OF A PARTICULAR SUB-UNIT</subject><subject>SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES</subject><subject>TRANSPORTING</subject><subject>VEHICLES IN 
GENERAL</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2024</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNqNyjEOAiEQBVAaC6PeYTyABW6M2xqisbIythuEvy7JMEOA-8fGA1i95q2Ne2FJgUFaesqeKaj0qkwZfdFIb98QSYUiUKgiyaw1IEM6MXyVJJ-tWc2eG3Y_N2Z_uz7d_YCiE1rxAYI-uYe143A-jid7Gf45X445MqU</recordid><startdate>20240723</startdate><enddate>20240723</enddate><creator>HUANG XIWEN</creator><creator>HUANG XIANGDANG</creator><creator>FEI HANSHENG</creator><creator>YANG QIULING</creator><scope>EVB</scope></search><sort><creationdate>20240723</creationdate><title>Vehicle optimal control method based on deep reinforcement learning</title><author>HUANG XIWEN ; HUANG XIANGDANG ; FEI HANSHENG ; YANG QIULING</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_CN118372851A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>chi ; eng</language><creationdate>2024</creationdate><topic>CALCULATING</topic><topic>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</topic><topic>COMPUTING</topic><topic>CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE ORDIFFERENT FUNCTION</topic><topic>CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES</topic><topic>CONTROLLING</topic><topic>COUNTING</topic><topic>PERFORMING OPERATIONS</topic><topic>PHYSICS</topic><topic>REGULATING</topic><topic>ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TOTHE CONTROL OF A PARTICULAR SUB-UNIT</topic><topic>SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES</topic><topic>TRANSPORTING</topic><topic>VEHICLES IN GENERAL</topic><toplevel>online_resources</toplevel><creatorcontrib>HUANG XIWEN</creatorcontrib><creatorcontrib>HUANG XIANGDANG</creatorcontrib><creatorcontrib>FEI HANSHENG</creatorcontrib><creatorcontrib>YANG QIULING</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>HUANG XIWEN</au><au>HUANG XIANGDANG</au><au>FEI HANSHENG</au><au>YANG QIULING</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Vehicle optimal control method based on deep reinforcement learning</title><date>2024-07-23</date><risdate>2024</risdate><abstract>The invention discloses a vehicle optimal control method based on deep reinforcement learning. The method comprises the following steps: step 1, establishing a strategy network and a mutually independent value network; 2, the vehicle is controlled to run, and samples are collected; 3, inputting the data st and at into a value network to obtain two value scores, and calculating a prediction score by taking the smaller value; inputting the state st + 1 into the strategy network to obtain an action at + 1, respectively inputting the data st + 1 and at + 1 into two value scores in the two value networks, determining a TD error according to the value scores and a prediction score, and updating the value networks; 4, updating the strategy network after the value network is updated twice; and step 5, repeating the steps 2-4 to perform network parameter tuning until the policy network achieves an expected effect, and outputting the finally updated policy network. The stability can be ensured in the process of optimiz</abstract><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | chi ; eng |
recordid | cdi_epo_espacenet_CN118372851A |
source | esp@cenet |
subjects | CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION ; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES ; CONTROLLING ; COUNTING ; PERFORMING OPERATIONS ; PHYSICS ; REGULATING ; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT ; SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES ; TRANSPORTING ; VEHICLES IN GENERAL |
title | Vehicle optimal control method based on deep reinforcement learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T01%3A16%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=HUANG%20XIWEN&rft.date=2024-07-23&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN118372851A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
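
The update scheme outlined in the description (two value networks, a score formed from the smaller of two values, a TD error, and a policy update after every two value-network updates) can be illustrated with a minimal sketch. This is not the patent's implementation. It assumes the common twin-critic convention in which the minimum is taken over the two next-state value scores; the abstract's wording on where the minimum is applied is ambiguous. The function names, the discount factor `gamma`, and the `done` flag are hypothetical and do not appear in the record.

```python
def td_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Target score: reward plus the discounted smaller of the two
    next-state value scores (assumed placement of the minimum).
    gamma and done are illustrative assumptions, not from the record."""
    bootstrap = 0.0 if done else gamma * min(q1_next, q2_next)
    return reward + bootstrap


def td_errors(q1, q2, target):
    """TD error of each value network against the shared target."""
    return target - q1, target - q2


def policy_update_due(value_update_count, delay=2):
    """True after every `delay` value-network updates (two in the abstract)."""
    return value_update_count % delay == 0


# Usage with made-up numbers for a single transition.
target = td_target(reward=1.0, done=False, q1_next=2.0, q2_next=3.0, gamma=0.5)
err1, err2 = td_errors(q1=1.5, q2=2.5, target=target)
schedule = [policy_update_due(k) for k in (1, 2, 3, 4)]
```

Using one shared target for both value networks keeps their updates consistent, and the `delay` argument makes the "updated twice" schedule explicit rather than hard-coded.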