An Integrated Reinforcement Learning and Centralized Programming Approach for Online Taxi Dispatching

Balancing the supply and demand for ride-sourcing companies is a challenging issue, especially with real-time requests and stochastic traffic conditions of large-scale congested road networks. To tackle this challenge, this article proposes a robust and scalable approach that integrates reinforcement learning (RL) and a centralized programming (CP) structure to promote real-time taxi operations. Both real-time order matching decisions and vehicle relocation decisions at the microscopic network scale are integrated within a Markov decision process framework.

Detailed description

Saved in:
Bibliographic details
Published in: IEEE Transactions on Neural Networks and Learning Systems 2022-09, Vol.33 (9), p.4742-4756
Main authors: Liang, Enming, Wen, Kexin, Lam, William H. K., Sumalee, Agachai, Zhong, Renxin
Format: Article
Language: English
Subjects:
Online access: Order full text
container_end_page 4756
container_issue 9
container_start_page 4742
container_title IEEE Transactions on Neural Networks and Learning Systems
container_volume 33
creator Liang, Enming
Wen, Kexin
Lam, William H. K.
Sumalee, Agachai
Zhong, Renxin
description Balancing the supply and demand for ride-sourcing companies is a challenging issue, especially with real-time requests and stochastic traffic conditions of large-scale congested road networks. To tackle this challenge, this article proposes a robust and scalable approach that integrates reinforcement learning (RL) and a centralized programming (CP) structure to promote real-time taxi operations. Both real-time order matching decisions and vehicle relocation decisions at the microscopic network scale are integrated within a Markov decision process framework. The RL component learns the decomposed state-value function, which represents the taxi drivers' experience, the off-line historical demand pattern, and the traffic network congestion. The CP component plans nonmyopic decisions for drivers collectively under the prescribed system constraints to explicitly realize cooperation. Furthermore, to circumvent sparse reward and sample imbalance problems over the microscopic road network, this article proposes a temporal-difference learning algorithm with prioritized gradient descent and adaptive exploration techniques. A simulator is built and trained with the Manhattan road network and New York City yellow taxi data to simulate the real-time vehicle dispatching environment. Both centralized and decentralized taxi dispatching policies are examined with the simulator. This case study shows that the proposed approach can further improve taxi drivers' profits while reducing customers' waiting times compared to several existing vehicle dispatching algorithms.
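The integrated RL + CP idea from the abstract can be illustrated with a toy sketch. Everything below is hypothetical and chosen for brevity, not taken from the paper: a handful of abstract zones stands in for the microscopic road network, a tabular TD(0) update stands in for the decomposed state-value learning, and a brute-force matcher over a tiny driver/order set stands in for the centralized program. The function names, zone count, and learning parameters are all assumptions.

```python
import itertools

# Hypothetical setup: a few abstract zones instead of a road network.
N_ZONES = 5
GAMMA = 0.9   # discount factor (illustrative value)
ALPHA = 0.1   # learning rate (illustrative value)

# --- RL component: tabular TD(0) state-value learning ---
# V[z] estimates the expected discounted future profit of an idle taxi
# located in zone z.
V = [0.0] * N_ZONES

def td_update(zone, reward, next_zone):
    """One temporal-difference update after completing a trip or relocation.

    Returns the TD error so a prioritized scheme could, in principle,
    weight updates by its magnitude."""
    td_error = reward + GAMMA * V[next_zone] - V[zone]
    V[zone] += ALPHA * td_error
    return td_error

# --- CP component: centralized, nonmyopic order matching ---
def score(driver_zone, order):
    """Value of assigning an order = immediate fare plus the discounted
    learned value of the order's destination zone (travel costs omitted
    for brevity)."""
    fare, dest_zone = order
    return fare + GAMMA * V[dest_zone]

def match(drivers, orders):
    """Exhaustive assignment for tiny instances (illustration only; a real
    system would solve a linear or integer program at scale)."""
    k = min(len(drivers), len(orders))
    best, best_pairs = float("-inf"), []
    for perm in itertools.permutations(range(len(orders)), k):
        pairs = list(zip(range(k), perm))
        total = sum(score(drivers[i], orders[j]) for i, j in pairs)
        if total > best:
            best, best_pairs = total, pairs
    return best_pairs  # list of (driver index, order index) pairs
```

Because `score` folds the learned value of each destination zone into the assignment objective, the matcher prefers orders that leave drivers in high-value zones, which is the nonmyopic coupling between the RL and CP components that the abstract describes.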
doi_str_mv 10.1109/TNNLS.2021.3060187
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 2162-237X
ispartof IEEE Transactions on Neural Networks and Learning Systems, 2022-09, Vol.33 (9), p.4742-4756
issn 2162-237X
2162-2388
language eng
recordid cdi_pubmed_primary_33651702
source IEEE Electronic Library (IEL)
subjects Algorithms
Car sharing
Decisions
Deep reinforcement learning (RL)
Dispatching
Driving conditions
Machine learning
Markov processes
multiagent system
online vehicle routing
Programming
Public transportation
Real time operation
Real-time systems
Reinforcement
Relocation
Roads
Simulation
stochastic network traffic
Stochasticity
Taxicabs
Traffic congestion
Traffic planning
vehicle dispatching
Vehicle dynamics
Vehicles
title An Integrated Reinforcement Learning and Centralized Programming Approach for Online Taxi Dispatching
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T04%3A13%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20Integrated%20Reinforcement%20Learning%20and%20Centralized%20Programming%20Approach%20for%20Online%20Taxi%20Dispatching&rft.jtitle=IEEE%20transaction%20on%20neural%20networks%20and%20learning%20systems&rft.au=Liang,%20Enming&rft.date=2022-09-01&rft.volume=33&rft.issue=9&rft.spage=4742&rft.epage=4756&rft.pages=4742-4756&rft.issn=2162-237X&rft.eissn=2162-2388&rft.coden=ITNNAL&rft_id=info:doi/10.1109/TNNLS.2021.3060187&rft_dat=%3Cproquest_RIE%3E2708643690%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2708643690&rft_id=info:pmid/33651702&rft_ieee_id=9366995&rfr_iscdi=true