Driving Style Encoder: Situational Reward Adaptation for General-Purpose Planning in Automated Driving

General-purpose planning algorithms for automated driving combine mission, behavior, and local motion planning. Such planning algorithms map features of the environment and driving kinematics into complex reward functions. To achieve this, planning experts often rely on linear reward functions. The specification and tuning of these reward functions is a tedious process and requires significant experience. Moreover, a manually designed linear reward function does not generalize across different driving situations. In this work, we propose a deep learning approach based on inverse reinforcement learning that generates situation-dependent reward functions. Our neural network provides a mapping between features and actions of sampled driving policies of a model-predictive control-based planner and predicts reward functions for upcoming planning cycles. In our evaluation, we compare the driving style of reward functions predicted by our deep network against clustered and linear reward functions. Our proposed deep learning approach outperforms clustered linear reward functions and is at par with linear reward functions with a-priori knowledge about the situation.

Detailed Description

Saved in:
Bibliographic Details
Published in: arXiv.org 2020-09
Main Authors: Rosbach, Sascha; Vinit, James; Großjohann, Simon; Homoceanu, Silviu; Li, Xing; Roth, Stefan
Format: Article
Language: eng
Subjects:
Online Access: Full text
container_title arXiv.org
creator Rosbach, Sascha
Vinit, James
Großjohann, Simon
Homoceanu, Silviu
Li, Xing
Roth, Stefan
description General-purpose planning algorithms for automated driving combine mission, behavior, and local motion planning. Such planning algorithms map features of the environment and driving kinematics into complex reward functions. To achieve this, planning experts often rely on linear reward functions. The specification and tuning of these reward functions is a tedious process and requires significant experience. Moreover, a manually designed linear reward function does not generalize across different driving situations. In this work, we propose a deep learning approach based on inverse reinforcement learning that generates situation-dependent reward functions. Our neural network provides a mapping between features and actions of sampled driving policies of a model-predictive control-based planner and predicts reward functions for upcoming planning cycles. In our evaluation, we compare the driving style of reward functions predicted by our deep network against clustered and linear reward functions. Our proposed deep learning approach outperforms clustered linear reward functions and is at par with linear reward functions with a-priori knowledge about the situation.
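The description above contrasts a fixed linear reward function (a weighted sum of planner features) with situation-dependent weights predicted per planning cycle. The following minimal sketch illustrates that idea only; all names, shapes, and the affine-plus-softmax "predictor" are illustrative assumptions standing in for the authors' trained deep network, not their implementation.

```python
import numpy as np

def linear_reward(weights, features):
    """Score one sampled driving policy as a linear reward:
    the dot product of reward weights with its feature vector
    (e.g. comfort, progress, safety features)."""
    return float(np.dot(weights, features))

def predict_weights(situation_features, W, b):
    """Hypothetical stand-in for the deep network: an affine map
    followed by a softmax, so the predicted reward weights are
    positive and sum to 1 for the upcoming planning cycle."""
    z = W @ situation_features + b
    e = np.exp(z - z.max())          # numerically stable softmax
    return e / e.sum()

# Toy usage: predict situation-dependent weights, then score a policy.
rng = np.random.default_rng(0)
situation = rng.normal(size=4)        # assumed features of the current scene
W, b = rng.normal(size=(3, 4)), np.zeros(3)
w = predict_weights(situation, W, b)  # situation-dependent reward weights
policy_features = np.array([0.2, 0.5, 0.3])
r = linear_reward(w, policy_features)
```

In this sketch, swapping the fixed weight vector of a hand-tuned linear reward for the output of `predict_weights` is what makes the reward situation-dependent; the paper's contribution is learning that mapping via inverse reinforcement learning.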
doi_str_mv 10.48550/arxiv.1912.03509
format Article
publisher Ithaca: Cornell University Library, arXiv.org
rights 2020. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
backlink https://doi.org/10.48550/arXiv.1912.03509 (view paper in arXiv); https://doi.org/10.1109/ICRA40945.2020.9196778 (view published paper; access to full text may be restricted)
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2020-09
issn 2331-8422
language eng
recordid cdi_arxiv_primary_1912_03509
source arXiv.org; Free E-Journals
subjects Algorithms
Automation
Coders
Computer Science - Artificial Intelligence
Computer Science - Learning
Computer Science - Robotics
Deep learning
Kinematics
Machine learning
Mapping
Motion planning
Neural networks
Predictive control
title Driving Style Encoder: Situational Reward Adaptation for General-Purpose Planning in Automated Driving
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T08%3A11%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Driving%20Style%20Encoder:%20Situational%20Reward%20Adaptation%20for%20General-Purpose%20Planning%20in%20Automated%20Driving&rft.jtitle=arXiv.org&rft.au=Rosbach,%20Sascha&rft.date=2020-09-13&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.1912.03509&rft_dat=%3Cproquest_arxiv%3E2323284508%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2323284508&rft_id=info:pmid/&rfr_iscdi=true