Reinforcement Learning Approach to Stochastic Vehicle Routing Problem with Correlated Demands

Bibliographic Details
Published in: IEEE Access, 2023-01, Vol. 11, p. 1-1
Authors: Iklassov, Zangir; Sobirov, Ikboljon; Solozabal, Ruben; Takac, Martin
Format: Article
Language: English
Online access: Full text
Abstract

We present a novel end-to-end framework for solving the Vehicle Routing Problem with stochastic demands (VRPSD) using Reinforcement Learning (RL). Our formulation incorporates the correlation between stochastic demands through other observable stochastic variables, thereby offering an experimental demonstration of the theoretical premise that non-i.i.d. stochastic demands provide opportunities for improved routing solutions. Our approach bridges a gap in the application of RL to the VRPSD and consists of a parameterized stochastic policy, optimized with a policy gradient algorithm, that generates the sequence of actions forming the solution. Our model outperforms previous state-of-the-art metaheuristics and is robust to changes in the environment, such as the supply type, vehicle capacity, demand correlation, and demand noise level. Moreover, the model can easily be retrained for different VRPSD scenarios by observing the reward signals and following the feasibility constraints, making it highly flexible and scalable. These findings highlight the potential of RL to improve transportation efficiency and mitigate its environmental impact in stochastic routing problems. Our implementation is available online.
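The abstract's premise — that correlated, non-i.i.d. demands carry exploitable information — can be illustrated with a toy demand generator. This is not the paper's data model: the shared `weather` factor and all parameter values below are hypothetical. The point is that conditioning on an observable covariate that drives the correlation shrinks the residual demand uncertainty a routing policy has to hedge against.

```python
import numpy as np

rng = np.random.default_rng(0)
n_customers, n_days = 10, 5000

# Per-customer mean demand and sensitivity to a shared observable factor.
base = rng.uniform(5.0, 15.0, size=n_customers)
beta = rng.uniform(0.5, 2.0, size=n_customers)

# A single observable variable (e.g. weather) drives correlation
# across all customer demands on the same day.
weather = rng.normal(0.0, 1.0, size=(n_days, 1))
noise = rng.normal(0.0, 1.0, size=(n_days, n_customers))
demand = base + beta * weather + noise   # shape (n_days, n_customers)

# Demands are correlated across customers (non-i.i.d.)...
corr = np.corrcoef(demand, rowvar=False)
off_diag = (corr.sum() - n_customers) / (n_customers * (n_customers - 1))
print("mean off-diagonal correlation:", off_diag)

# ...but conditioning on the observable removes the shared component,
# leaving much less variance for the routing policy to hedge against.
residual = demand - (base + beta * weather)
print("unconditional variance:", demand.var(axis=0).mean())
print("conditional variance:  ", residual.var(axis=0).mean())
```

A policy that observes the covariate before committing to a route effectively faces the smaller conditional variance, which is exactly the opportunity for improved routing that the abstract refers to.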
DOI: 10.1109/ACCESS.2023.3306076
ISSN/EISSN: 2169-3536
Sources: IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals
Subjects:
Algorithms
Correlation
Costs
Environmental impact
Heuristic methods
Metaheuristics
Noise levels
Reinforcement learning
Routing
Stochastic processes
Stochastic Optimization
Vehicle routing
Vehicle Routing Problem
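The training scheme the abstract describes — a parameterized stochastic policy that emits a sequence of actions, optimized with a policy gradient algorithm from reward signals — can be sketched with a deliberately minimal REINFORCE example. This is not the authors' architecture: the one-parameter softmax-over-distances policy and the deterministic TSP-style tour cost below are stand-ins chosen only to keep the illustration self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
coords = rng.uniform(0.0, 1.0, size=(n, 2))
dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)

theta = 0.0   # single policy parameter: sensitivity to distance
lr = 0.05

def rollout(theta):
    """Sample a tour node-by-node from a softmax policy over distances.

    Returns the tour cost and d/dtheta of the trajectory log-probability.
    """
    tour, visited, grad_logp = [0], {0}, 0.0
    for _ in range(n - 1):
        cur = tour[-1]
        cand = np.array([j for j in range(n) if j not in visited])
        logits = -theta * dist[cur, cand]
        p = np.exp(logits - logits.max())
        p /= p.sum()
        k = rng.choice(len(cand), p=p)
        nxt = int(cand[k])
        # For logit_j = -theta * d_j:  d/dtheta log pi(a) = -d_a + sum_j p_j d_j
        grad_logp += -dist[cur, nxt] + (p * dist[cur, cand]).sum()
        tour.append(nxt)
        visited.add(nxt)
    cost = sum(dist[tour[i], tour[i + 1]] for i in range(n - 1)) + dist[tour[-1], 0]
    return cost, grad_logp

baseline = None
for _ in range(400):
    cost, g = rollout(theta)
    # Exponential moving average as a variance-reducing baseline.
    baseline = cost if baseline is None else 0.9 * baseline + 0.1 * cost
    # REINFORCE update, descending the expected tour cost.
    theta -= lr * (cost - baseline) * g

print("learned theta:", theta)
```

After training, `theta` grows positive — the policy learns to prefer nearby nodes — and sampled tours are shorter on average than those from the untrained (uniform, `theta = 0`) policy. The paper's setting replaces this scalar parameter with a neural policy, the deterministic cost with stochastic correlated demands, and adds feasibility constraints, but the gradient estimator is of the same family.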