Implications of stop-and-go traffic on training learning-based car-following control

Learning-based car-following control (LCC) of connected and autonomous vehicles (CAVs) is gaining significant attention with the advancement of computing power and data accessibility. While the flexibility and large model capacity of model-free architecture enable LCC to potentially outperform the m...

Bibliographic details
Published in: Transportation research. Part C, Emerging technologies, 2024-11, Vol.168, p.104578, Article 104578
Main authors: Zhou, Anye; Peeta, Srinivas; Zhou, Hao; Laval, Jorge; Wang, Zejiang; Cook, Adian
Format: Article
Language: English
Subjects: Behavior cloning; Car-following control; Deep reinforcement learning; Generalizability; System identification
Online access: Full text
container_end_page
container_issue
container_start_page 104578
container_title Transportation research. Part C, Emerging technologies
container_volume 168
creator Zhou, Anye
Peeta, Srinivas
Zhou, Hao
Laval, Jorge
Wang, Zejiang
Cook, Adian
description Learning-based car-following control (LCC) of connected and autonomous vehicles (CAVs) is gaining significant attention with the advancement of computing power and data accessibility. While the flexibility and large model capacity of model-free architecture enable LCC to potentially outperform the model-based car-following (CF) model in improving traffic efficiency and mitigating congestion, the generalizability of LCC for traffic conditions different from the training environment/dataset is not well-understood. This study seeks to explore the impact of stop-and-go traffic in the training dataset on the generalizability of LCC. It uses the characteristics of lead vehicle trajectories to describe stop-and-go traffic, and links the theory of identifiability (i.e., obtaining a unique parameter estimation result using sensor measurements) to the generalizability of behavior cloning (BC) and policy-based deep reinforcement learning (DRL). Correspondingly, the study shows theoretically that: (i) stop-and-go traffic can enable the property of identifiability and enhance the control performance of BC-based LCC in different traffic conditions; (ii) stop-and-go traffic is not necessary for DRL-based LCC to generalize to different traffic conditions; (iii) DRL-based LCC trained with only constant-speed lead vehicle trajectories (not sufficient to ensure identifiability) can be generalized to different traffic conditions; and (iv) stop-and-go traffic increases variance in the training dataset, which improves the convergence of parameter estimation while negatively impacting the convergence of DRL to the optimal control policy. Numerical experiments validate the above findings, illustrating that BC-based LCC entails comprehensive training datasets for generalizing to different traffic conditions, while DRL-based LCC can achieve generalization with simple free-flow traffic training environments. 
This further suggests DRL as a more promising and cost-effective LCC approach to reduce operational costs, mitigate traffic congestion, and enhance safety and mobility, which can accelerate the deployment and acceptance of CAVs.
doi_str_mv 10.1016/j.trc.2024.104578
format Article
fullrecord <record><control><sourceid>elsevier_osti_</sourceid><recordid>TN_cdi_osti_scitechconnect_2438666</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0968090X24000998</els_id><sourcerecordid>S0968090X24000998</sourcerecordid><originalsourceid>FETCH-LOGICAL-c276t-22c00bb1c20a4a3ce2e102b7c94b34e5739385a934be3150cfaad168645c11c33</originalsourceid><addsrcrecordid>eNp9kE9LxDAQxXNQcF39AN6K96yTP01bPMmi7sKClxW8hXSarlm6yZIExW9vSz17mjfMe4_hR8gdgxUDph6OqxxxxYHLcZdlVV-QBTSqptDAxxW5TukIAKwpqwXZb0_nwaHJLvhUhL5IOZyp8R09hCJH0_cOi-An6bzzh2KwJk6CtibZrkATaR-GIXxPRww-xzDckMveDMne_s0leX953q83dPf2ul0_7SjySmXKOQK0LUMORhqBllsGvK2wka2QtqxEI-rSNEK2VrASsDemY6pWskTGUIgluZ97Q8pOJ3TZ4uf4g7eYNZeiVkqNJjabMIaUou31ObqTiT-agZ546aMeeemJl555jZnHOWPH77-cjVO59Wg7F6fuLrh_0r9rYHV6</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Implications of stop-and-go traffic on training learning-based car-following control</title><source>ScienceDirect Journals (5 years ago - present)</source><creator>Zhou, Anye ; Peeta, Srinivas ; Zhou, Hao ; Laval, Jorge ; Wang, Zejiang ; Cook, Adian</creator><creatorcontrib>Zhou, Anye ; Peeta, Srinivas ; Zhou, Hao ; Laval, Jorge ; Wang, Zejiang ; Cook, Adian ; Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)</creatorcontrib><description>Learning-based car-following control (LCC) of connected and autonomous vehicles (CAVs) is gaining significant attention with the advancement of computing power and data accessibility. While the flexibility and large model capacity of model-free architecture enable LCC to potentially outperform the model-based car-following (CF) model in improving traffic efficiency and mitigating congestion, the generalizability of LCC for traffic conditions different from the training environment/dataset is not well-understood. 
This study seeks to explore the impact of stop-and-go traffic in the training dataset on the generalizability of LCC. It uses the characteristics of lead vehicle trajectories to describe stop-and-go traffic, and links the theory of identifiability (i.e., obtaining a unique parameter estimation result using sensor measurements) to the generalizability of behavior cloning (BC) and policy-based deep reinforcement learning (DRL). Correspondingly, the study shows theoretically that: (i) stop-and-go traffic can enable the property of identifiability and enhance the control performance of BC-based LCC in different traffic conditions; (ii) stop-and-go traffic is not necessary for DRL-based LCC to generalize to different traffic conditions; (iii) DRL-based LCC trained with only constant-speed lead vehicle trajectories (not sufficient to ensure identifiability) can be generalized to different traffic conditions; and (iv) stop-and-go traffic increases variance in the training dataset, which improves the convergence of parameter estimation while negatively impacting the convergence of DRL to the optimal control policy. Numerical experiments validate the above findings, illustrating that BC-based LCC entails comprehensive training datasets for generalizing to different traffic conditions, while DRL-based LCC can achieve generalization with simple free-flow traffic training environments. This further suggests DRL as a more promising and cost-effective LCC approach to reduce operational costs, mitigate traffic congestion, and enhance safety and mobility, which can accelerate the deployment and acceptance of CAVs.</description><identifier>ISSN: 0968-090X</identifier><identifier>DOI: 10.1016/j.trc.2024.104578</identifier><language>eng</language><publisher>United States: Elsevier Ltd</publisher><subject>Behavior cloning ; Car-following control ; Deep reinforcement learning ; Generalizability ; System identification</subject><ispartof>Transportation research. 
Part C, Emerging technologies, 2024-11, Vol.168, p.104578, Article 104578</ispartof><rights>2024 Elsevier Ltd</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c276t-22c00bb1c20a4a3ce2e102b7c94b34e5739385a934be3150cfaad168645c11c33</cites><orcidid>0000-0002-0986-4046 ; 0000-0002-9184-5169 ; 0000-0003-0145-5579 ; 0000000160825395 ; 0000000301455579 ; 0000000291845169 ; 0000000209864046</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.trc.2024.104578$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>230,314,780,784,885,3548,27923,27924,45994</link.rule.ids><backlink>$$Uhttps://www.osti.gov/biblio/2438666$$D View this record in Osti.gov$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhou, Anye</creatorcontrib><creatorcontrib>Peeta, Srinivas</creatorcontrib><creatorcontrib>Zhou, Hao</creatorcontrib><creatorcontrib>Laval, Jorge</creatorcontrib><creatorcontrib>Wang, Zejiang</creatorcontrib><creatorcontrib>Cook, Adian</creatorcontrib><creatorcontrib>Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)</creatorcontrib><title>Implications of stop-and-go traffic on training learning-based car-following control</title><title>Transportation research. Part C, Emerging technologies</title><description>Learning-based car-following control (LCC) of connected and autonomous vehicles (CAVs) is gaining significant attention with the advancement of computing power and data accessibility. While the flexibility and large model capacity of model-free architecture enable LCC to potentially outperform the model-based car-following (CF) model in improving traffic efficiency and mitigating congestion, the generalizability of LCC for traffic conditions different from the training environment/dataset is not well-understood. 
This study seeks to explore the impact of stop-and-go traffic in the training dataset on the generalizability of LCC. It uses the characteristics of lead vehicle trajectories to describe stop-and-go traffic, and links the theory of identifiability (i.e., obtaining a unique parameter estimation result using sensor measurements) to the generalizability of behavior cloning (BC) and policy-based deep reinforcement learning (DRL). Correspondingly, the study shows theoretically that: (i) stop-and-go traffic can enable the property of identifiability and enhance the control performance of BC-based LCC in different traffic conditions; (ii) stop-and-go traffic is not necessary for DRL-based LCC to generalize to different traffic conditions; (iii) DRL-based LCC trained with only constant-speed lead vehicle trajectories (not sufficient to ensure identifiability) can be generalized to different traffic conditions; and (iv) stop-and-go traffic increases variance in the training dataset, which improves the convergence of parameter estimation while negatively impacting the convergence of DRL to the optimal control policy. Numerical experiments validate the above findings, illustrating that BC-based LCC entails comprehensive training datasets for generalizing to different traffic conditions, while DRL-based LCC can achieve generalization with simple free-flow traffic training environments. 
This further suggests DRL as a more promising and cost-effective LCC approach to reduce operational costs, mitigate traffic congestion, and enhance safety and mobility, which can accelerate the deployment and acceptance of CAVs.</description><subject>Behavior cloning</subject><subject>Car-following control</subject><subject>Deep reinforcement learning</subject><subject>Generalizability</subject><subject>System identification</subject><issn>0968-090X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kE9LxDAQxXNQcF39AN6K96yTP01bPMmi7sKClxW8hXSarlm6yZIExW9vSz17mjfMe4_hR8gdgxUDph6OqxxxxYHLcZdlVV-QBTSqptDAxxW5TukIAKwpqwXZb0_nwaHJLvhUhL5IOZyp8R09hCJH0_cOi-An6bzzh2KwJk6CtibZrkATaR-GIXxPRww-xzDckMveDMne_s0leX953q83dPf2ul0_7SjySmXKOQK0LUMORhqBllsGvK2wka2QtqxEI-rSNEK2VrASsDemY6pWskTGUIgluZ97Q8pOJ3TZ4uf4g7eYNZeiVkqNJjabMIaUou31ObqTiT-agZ546aMeeemJl555jZnHOWPH77-cjVO59Wg7F6fuLrh_0r9rYHV6</recordid><startdate>20241101</startdate><enddate>20241101</enddate><creator>Zhou, Anye</creator><creator>Peeta, Srinivas</creator><creator>Zhou, Hao</creator><creator>Laval, Jorge</creator><creator>Wang, Zejiang</creator><creator>Cook, Adian</creator><general>Elsevier Ltd</general><general>Elsevier</general><scope>AAYXX</scope><scope>CITATION</scope><scope>OTOTI</scope><orcidid>https://orcid.org/0000-0002-0986-4046</orcidid><orcidid>https://orcid.org/0000-0002-9184-5169</orcidid><orcidid>https://orcid.org/0000-0003-0145-5579</orcidid><orcidid>https://orcid.org/0000000160825395</orcidid><orcidid>https://orcid.org/0000000301455579</orcidid><orcidid>https://orcid.org/0000000291845169</orcidid><orcidid>https://orcid.org/0000000209864046</orcidid></search><sort><creationdate>20241101</creationdate><title>Implications of stop-and-go traffic on training learning-based car-following control</title><author>Zhou, Anye ; Peeta, Srinivas ; Zhou, Hao ; Laval, Jorge ; Wang, Zejiang ; Cook, 
Adian</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c276t-22c00bb1c20a4a3ce2e102b7c94b34e5739385a934be3150cfaad168645c11c33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Behavior cloning</topic><topic>Car-following control</topic><topic>Deep reinforcement learning</topic><topic>Generalizability</topic><topic>System identification</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhou, Anye</creatorcontrib><creatorcontrib>Peeta, Srinivas</creatorcontrib><creatorcontrib>Zhou, Hao</creatorcontrib><creatorcontrib>Laval, Jorge</creatorcontrib><creatorcontrib>Wang, Zejiang</creatorcontrib><creatorcontrib>Cook, Adian</creatorcontrib><creatorcontrib>Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)</creatorcontrib><collection>CrossRef</collection><collection>OSTI.GOV</collection><jtitle>Transportation research. Part C, Emerging technologies</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhou, Anye</au><au>Peeta, Srinivas</au><au>Zhou, Hao</au><au>Laval, Jorge</au><au>Wang, Zejiang</au><au>Cook, Adian</au><aucorp>Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Implications of stop-and-go traffic on training learning-based car-following control</atitle><jtitle>Transportation research. Part C, Emerging technologies</jtitle><date>2024-11-01</date><risdate>2024</risdate><volume>168</volume><spage>104578</spage><pages>104578-</pages><artnum>104578</artnum><issn>0968-090X</issn><abstract>Learning-based car-following control (LCC) of connected and autonomous vehicles (CAVs) is gaining significant attention with the advancement of computing power and data accessibility. 
While the flexibility and large model capacity of model-free architecture enable LCC to potentially outperform the model-based car-following (CF) model in improving traffic efficiency and mitigating congestion, the generalizability of LCC for traffic conditions different from the training environment/dataset is not well-understood. This study seeks to explore the impact of stop-and-go traffic in the training dataset on the generalizability of LCC. It uses the characteristics of lead vehicle trajectories to describe stop-and-go traffic, and links the theory of identifiability (i.e., obtaining a unique parameter estimation result using sensor measurements) to the generalizability of behavior cloning (BC) and policy-based deep reinforcement learning (DRL). Correspondingly, the study shows theoretically that: (i) stop-and-go traffic can enable the property of identifiability and enhance the control performance of BC-based LCC in different traffic conditions; (ii) stop-and-go traffic is not necessary for DRL-based LCC to generalize to different traffic conditions; (iii) DRL-based LCC trained with only constant-speed lead vehicle trajectories (not sufficient to ensure identifiability) can be generalized to different traffic conditions; and (iv) stop-and-go traffic increases variance in the training dataset, which improves the convergence of parameter estimation while negatively impacting the convergence of DRL to the optimal control policy. Numerical experiments validate the above findings, illustrating that BC-based LCC entails comprehensive training datasets for generalizing to different traffic conditions, while DRL-based LCC can achieve generalization with simple free-flow traffic training environments. 
This further suggests DRL as a more promising and cost-effective LCC approach to reduce operational costs, mitigate traffic congestion, and enhance safety and mobility, which can accelerate the deployment and acceptance of CAVs.</abstract><cop>United States</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.trc.2024.104578</doi><orcidid>https://orcid.org/0000-0002-0986-4046</orcidid><orcidid>https://orcid.org/0000-0002-9184-5169</orcidid><orcidid>https://orcid.org/0000-0003-0145-5579</orcidid><orcidid>https://orcid.org/0000000160825395</orcidid><orcidid>https://orcid.org/0000000301455579</orcidid><orcidid>https://orcid.org/0000000291845169</orcidid><orcidid>https://orcid.org/0000000209864046</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0968-090X
ispartof Transportation research. Part C, Emerging technologies, 2024-11, Vol.168, p.104578, Article 104578
issn 0968-090X
language eng
recordid cdi_osti_scitechconnect_2438666
source ScienceDirect Journals (5 years ago - present)
subjects Behavior cloning
Car-following control
Deep reinforcement learning
Generalizability
System identification
title Implications of stop-and-go traffic on training learning-based car-following control
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T01%3A01%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_osti_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Implications%20of%20stop-and-go%20traffic%20on%20training%20learning-based%20car-following%20control&rft.jtitle=Transportation%20research.%20Part%20C,%20Emerging%20technologies&rft.au=Zhou,%20Anye&rft.aucorp=Oak%20Ridge%20National%20Laboratory%20(ORNL),%20Oak%20Ridge,%20TN%20(United%20States)&rft.date=2024-11-01&rft.volume=168&rft.spage=104578&rft.pages=104578-&rft.artnum=104578&rft.issn=0968-090X&rft_id=info:doi/10.1016/j.trc.2024.104578&rft_dat=%3Celsevier_osti_%3ES0968090X24000998%3C/elsevier_osti_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_els_id=S0968090X24000998&rfr_iscdi=true