Vehicle optimal control method based on deep reinforcement learning

The invention discloses a vehicle optimal control method based on deep reinforcement learning. The method comprises the following steps: step 1, establishing a policy network and two mutually independent value networks; step 2, controlling the vehicle to run and collecting samples; step 3, inputting the...

Bibliographic details
Main authors: HUANG XIWEN, HUANG XIANGDANG, FEI HANSHENG, YANG QIULING
Format: Patent
Language: chi ; eng
creator HUANG XIWEN
HUANG XIANGDANG
FEI HANSHENG
YANG QIULING
description The invention discloses a vehicle optimal control method based on deep reinforcement learning. The method comprises the following steps: step 1, establishing a policy network and two mutually independent value networks; step 2, controlling the vehicle to run and collecting samples; step 3, inputting the data s_t and a_t into the value networks to obtain two value scores and taking the smaller one as the prediction score; inputting the state s_{t+1} into the policy network to obtain an action a_{t+1}; inputting s_{t+1} and a_{t+1} into the two value networks to obtain two value scores; determining a TD error from these value scores and the prediction score; and updating the value networks; step 4, updating the policy network after the value networks have been updated twice; and step 5, repeating steps 2-4 to tune the network parameters until the policy network achieves the expected effect, and outputting the finally updated policy network. The stability can be ensured in the process of optimiz
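The steps in the description can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the patented implementation: the linear "networks", the `features`, `step`, and `train` helpers, the discount factor, the learning rates, and the form of the TD target (reward plus discounted smaller next-state score) are all hypothetical, since the abstract does not specify them. The twin value scores and the delayed policy update resemble the scheme commonly known as TD3, but the record itself does not name it.

```python
import random

GAMMA = 0.9        # discount factor (assumed; not given in the abstract)
POLICY_DELAY = 2   # step 4: one policy update per two value updates

def features(s, a):
    # Hypothetical feature vector for a linear critic.
    return [s, a, s * a, 1.0]

def q_value(w, s, a):
    # Linear stand-in for a value network: Q(s, a) = w . features(s, a).
    return sum(wi * xi for wi, xi in zip(w, features(s, a)))

def policy(theta, s):
    # Linear stand-in for the policy network, clipped to [-1, 1].
    return max(-1.0, min(1.0, theta * s))

def step(s, a):
    # Toy dynamics (assumed): s is a tracking error, a is a control input.
    s_next = 0.9 * s + 0.5 * a
    return s_next, -s_next * s_next

def td_error(w1, w2, theta, s, a, r, s_next):
    # Step 3: score (s_t, a_t) with both critics; keep the smaller score.
    prediction = min(q_value(w1, s, a), q_value(w2, s, a))
    # The policy proposes a_{t+1}; both critics score (s_{t+1}, a_{t+1}).
    a_next = policy(theta, s_next)
    next_score = min(q_value(w1, s_next, a_next), q_value(w2, s_next, a_next))
    # Assumption: target = r + GAMMA * smaller next-state score.
    return r + GAMMA * next_score - prediction

def train(steps=200, seed=0, lr_q=0.01, lr_pi=0.01):
    rng = random.Random(seed)
    # Step 1: one policy and two independently initialised critics.
    w1 = [rng.uniform(-0.1, 0.1) for _ in range(4)]
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(4)]
    theta, policy_updates = 0.0, 0
    s = rng.uniform(-1.0, 1.0)
    for t in range(1, steps + 1):
        # Step 2: act with exploration noise and collect one sample.
        a = policy(theta, s) + rng.gauss(0.0, 0.1)
        s_next, r = step(s, a)
        # Step 3: move both critics along the shared TD error.
        delta = td_error(w1, w2, theta, s, a, r, s_next)
        x = features(s, a)
        w1 = [wi + lr_q * delta * xi for wi, xi in zip(w1, x)]
        w2 = [wi + lr_q * delta * xi for wi, xi in zip(w2, x)]
        if t % POLICY_DELAY == 0:
            # Step 4: ascend dQ1/da * da/dtheta (clipping ignored here).
            theta += lr_pi * (w1[1] + w1[2] * s) * s
            policy_updates += 1
        s = s_next
    return theta, policy_updates
```

Calling `train(steps=10)` performs ten value updates and five policy updates, which mirrors the two-to-one update ratio in step 4. Taking the smaller of the two scores is a conservative choice that counteracts over-optimistic value estimates; a real implementation would replace the linear stand-ins with neural networks and a replay buffer.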
format Patent
fullrecord Patent CN118372851A, date 2024-07-23; source esp@cenet (open access, free to read); record at https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20240723&DB=EPODOC&CC=CN&NR=118372851A
fulltext fulltext_linktorsrc
language chi ; eng
recordid cdi_epo_espacenet_CN118372851A
source esp@cenet
subjects CALCULATING
COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
COMPUTING
CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION
CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES
CONTROLLING
COUNTING
PERFORMING OPERATIONS
PHYSICS
REGULATING
ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
TRANSPORTING
VEHICLES IN GENERAL
title Vehicle optimal control method based on deep reinforcement learning
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T01%3A16%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=HUANG%20XIWEN&rft.date=2024-07-23&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN118372851A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true