A deep reinforcement learning approach for the meal delivery problem

We consider a meal delivery service fulfilling dynamic customer requests given a set of couriers over the course of a day. A courier’s duty is to pick up an order from a restaurant and deliver it to a customer. We model this service as a Markov decision process and use deep reinforcement learning as...

Bibliographic details
Published in: Knowledge-based systems 2022-05, Vol.243, p.108489, Article 108489
Main authors: Jahanshahi, Hadi; Bozanta, Aysun; Cevik, Mucahit; Kavuk, Eray Mert; Tosun, Ayşe; Sonuc, Sibel B.; Kosucu, Bilgin; Başar, Ayşe
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page 108489
container_title Knowledge-based systems
container_volume 243
creator Jahanshahi, Hadi
Bozanta, Aysun
Cevik, Mucahit
Kavuk, Eray Mert
Tosun, Ayşe
Sonuc, Sibel B.
Kosucu, Bilgin
Başar, Ayşe
description We consider a meal delivery service fulfilling dynamic customer requests given a set of couriers over the course of a day. A courier’s duty is to pick up an order from a restaurant and deliver it to a customer. We model this service as a Markov decision process and use deep reinforcement learning as the solution approach. We experiment with the resulting policies on synthetic and real-world datasets and compare those with the baseline policies. We also examine the courier utilization for different numbers of couriers. In our analysis, we specifically focus on the impact of the limited available resources in the meal delivery problem. Furthermore, we investigate the effect of intelligent order rejection and re-positioning of the couriers. Our numerical experiments show that, by incorporating the geographical locations of the restaurants, customers, and the depot, our model significantly improves the overall service quality as characterized by the expected total reward and the delivery times. Our results present valuable insights on both the courier assignment process and the optimal number of couriers for different order frequencies on a given day. The proposed model also shows a robust performance under a variety of scenarios for real-world implementation.
• Our MDP model for the courier assignment task characterizes on-demand meal delivery service.
• We tailor deep reinforcement learning algorithms to address the problem in a dynamic environment.
• We incorporate the notion of order rejection to reduce the number of late orders.
• We investigate the importance of intelligent repositioning of the couriers during their idle times.
doi_str_mv 10.1016/j.knosys.2022.108489
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2655163862</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0950705122002088</els_id><sourcerecordid>2655163862</sourcerecordid><originalsourceid>FETCH-LOGICAL-c334t-384aa7475e7651d15dc5f8b52f4dd37c64c9ed564bef44ff7f866e08775370913</originalsourceid><addsrcrecordid>eNp9kEtLAzEQx4MoWKvfwEPA89ZkN6-9CKU-oeBFzyFNJjbrPmqyLfTbm7KePQ3M_B_MD6FbShaUUHHfLL77IR3ToiRlmVeKqfoMzaiSZSEZqc_RjNScFJJweomuUmoIyUqqZuhxiR3ADkcIvR-ihQ76EbdgYh_6L2x2uzgYu8X5hsct4A5Mmx1tOEA84nzctNBdowtv2gQ3f3OOPp-fPlavxfr95W21XBe2qthYVIoZI5nkIAWnjnJnuVcbXnrmXCWtYLYGxwXbgGfMe-mVEECUlLySpKbVHN1Nubn3Zw9p1M2wj32u1KXgnIpKiTKr2KSycUgpgte7GDoTj5oSfeKlGz3x0ideeuKVbQ-TDfIHhwBRJxugt-BCBDtqN4T_A34BtGV1Ow</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2655163862</pqid></control><display><type>article</type><title>A deep reinforcement learning approach for the meal delivery problem</title><source>ScienceDirect Journals (5 years ago - present)</source><creator>Jahanshahi, Hadi ; Bozanta, Aysun ; Cevik, Mucahit ; Kavuk, Eray Mert ; Tosun, Ayşe ; Sonuc, Sibel B. ; Kosucu, Bilgin ; Başar, Ayşe</creator><creatorcontrib>Jahanshahi, Hadi ; Bozanta, Aysun ; Cevik, Mucahit ; Kavuk, Eray Mert ; Tosun, Ayşe ; Sonuc, Sibel B. ; Kosucu, Bilgin ; Başar, Ayşe</creatorcontrib><description>We consider a meal delivery service fulfilling dynamic customer requests given a set of couriers over the course of a day. A courier’s duty is to pick up an order from a restaurant and deliver it to a customer. We model this service as a Markov decision process and use deep reinforcement learning as the solution approach. We experiment with the resulting policies on synthetic and real-world datasets and compare those with the baseline policies. We also examine the courier utilization for different numbers of couriers. 
In our analysis, we specifically focus on the impact of the limited available resources in the meal delivery problem. Furthermore, we investigate the effect of intelligent order rejection and re-positioning of the couriers. Our numerical experiments show that, by incorporating the geographical locations of the restaurants, customers, and the depot, our model significantly improves the overall service quality as characterized by the expected total reward and the delivery times. Our results present valuable insights on both the courier assignment process and the optimal number of couriers for different order frequencies on a given day. The proposed model also shows a robust performance under a variety of scenarios for real-world implementation. •Our MDP model for the courier assignment task characterizes on-demand meal delivery service.•We tailor deep reinforcement learning algorithms to address the problem in a dynamic environment.•We incorporate the notion of order rejection to reduce the number of late orders.•We investigate the importance of intelligent repositioning of the couriers during their idle times.</description><identifier>ISSN: 0950-7051</identifier><identifier>EISSN: 1872-7409</identifier><identifier>DOI: 10.1016/j.knosys.2022.108489</identifier><language>eng</language><publisher>Amsterdam: Elsevier B.V</publisher><subject>Courier assignment ; Customer services ; Customers ; DDQN ; Deep learning ; Delivery services ; DQN ; Geographical locations ; Markov processes ; Meal delivery ; Policies ; Reinforcement learning ; Restaurants ; Robustness (mathematics)</subject><ispartof>Knowledge-based systems, 2022-05, Vol.243, p.108489, Article 108489</ispartof><rights>2022 Elsevier B.V.</rights><rights>Copyright Elsevier Science Ltd. 
May 11, 2022</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c334t-384aa7475e7651d15dc5f8b52f4dd37c64c9ed564bef44ff7f866e08775370913</citedby><cites>FETCH-LOGICAL-c334t-384aa7475e7651d15dc5f8b52f4dd37c64c9ed564bef44ff7f866e08775370913</cites><orcidid>0000-0003-4934-8326 ; 0000-0003-4020-6305 ; 0000-0003-1859-7872 ; 0000-0001-7248-6263 ; 0000-0002-5569-3938 ; 0000-0002-1768-6278</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.knosys.2022.108489$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3548,27923,27924,45994</link.rule.ids></links><search><creatorcontrib>Jahanshahi, Hadi</creatorcontrib><creatorcontrib>Bozanta, Aysun</creatorcontrib><creatorcontrib>Cevik, Mucahit</creatorcontrib><creatorcontrib>Kavuk, Eray Mert</creatorcontrib><creatorcontrib>Tosun, Ayşe</creatorcontrib><creatorcontrib>Sonuc, Sibel B.</creatorcontrib><creatorcontrib>Kosucu, Bilgin</creatorcontrib><creatorcontrib>Başar, Ayşe</creatorcontrib><title>A deep reinforcement learning approach for the meal delivery problem</title><title>Knowledge-based systems</title><description>We consider a meal delivery service fulfilling dynamic customer requests given a set of couriers over the course of a day. A courier’s duty is to pick up an order from a restaurant and deliver it to a customer. We model this service as a Markov decision process and use deep reinforcement learning as the solution approach. We experiment with the resulting policies on synthetic and real-world datasets and compare those with the baseline policies. We also examine the courier utilization for different numbers of couriers. In our analysis, we specifically focus on the impact of the limited available resources in the meal delivery problem. 
Furthermore, we investigate the effect of intelligent order rejection and re-positioning of the couriers. Our numerical experiments show that, by incorporating the geographical locations of the restaurants, customers, and the depot, our model significantly improves the overall service quality as characterized by the expected total reward and the delivery times. Our results present valuable insights on both the courier assignment process and the optimal number of couriers for different order frequencies on a given day. The proposed model also shows a robust performance under a variety of scenarios for real-world implementation. •Our MDP model for the courier assignment task characterizes on-demand meal delivery service.•We tailor deep reinforcement learning algorithms to address the problem in a dynamic environment.•We incorporate the notion of order rejection to reduce the number of late orders.•We investigate the importance of intelligent repositioning of the couriers during their idle times.</description><subject>Courier assignment</subject><subject>Customer services</subject><subject>Customers</subject><subject>DDQN</subject><subject>Deep learning</subject><subject>Delivery services</subject><subject>DQN</subject><subject>Geographical locations</subject><subject>Markov processes</subject><subject>Meal delivery</subject><subject>Policies</subject><subject>Reinforcement learning</subject><subject>Restaurants</subject><subject>Robustness 
(mathematics)</subject><issn>0950-7051</issn><issn>1872-7409</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp9kEtLAzEQx4MoWKvfwEPA89ZkN6-9CKU-oeBFzyFNJjbrPmqyLfTbm7KePQ3M_B_MD6FbShaUUHHfLL77IR3ToiRlmVeKqfoMzaiSZSEZqc_RjNScFJJweomuUmoIyUqqZuhxiR3ADkcIvR-ihQ76EbdgYh_6L2x2uzgYu8X5hsct4A5Mmx1tOEA84nzctNBdowtv2gQ3f3OOPp-fPlavxfr95W21XBe2qthYVIoZI5nkIAWnjnJnuVcbXnrmXCWtYLYGxwXbgGfMe-mVEECUlLySpKbVHN1Nubn3Zw9p1M2wj32u1KXgnIpKiTKr2KSycUgpgte7GDoTj5oSfeKlGz3x0ideeuKVbQ-TDfIHhwBRJxugt-BCBDtqN4T_A34BtGV1Ow</recordid><startdate>20220511</startdate><enddate>20220511</enddate><creator>Jahanshahi, Hadi</creator><creator>Bozanta, Aysun</creator><creator>Cevik, Mucahit</creator><creator>Kavuk, Eray Mert</creator><creator>Tosun, Ayşe</creator><creator>Sonuc, Sibel B.</creator><creator>Kosucu, Bilgin</creator><creator>Başar, Ayşe</creator><general>Elsevier B.V</general><general>Elsevier Science Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>E3H</scope><scope>F2A</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-4934-8326</orcidid><orcidid>https://orcid.org/0000-0003-4020-6305</orcidid><orcidid>https://orcid.org/0000-0003-1859-7872</orcidid><orcidid>https://orcid.org/0000-0001-7248-6263</orcidid><orcidid>https://orcid.org/0000-0002-5569-3938</orcidid><orcidid>https://orcid.org/0000-0002-1768-6278</orcidid></search><sort><creationdate>20220511</creationdate><title>A deep reinforcement learning approach for the meal delivery problem</title><author>Jahanshahi, Hadi ; Bozanta, Aysun ; Cevik, Mucahit ; Kavuk, Eray Mert ; Tosun, Ayşe ; Sonuc, Sibel B. 
; Kosucu, Bilgin ; Başar, Ayşe</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c334t-384aa7475e7651d15dc5f8b52f4dd37c64c9ed564bef44ff7f866e08775370913</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Courier assignment</topic><topic>Customer services</topic><topic>Customers</topic><topic>DDQN</topic><topic>Deep learning</topic><topic>Delivery services</topic><topic>DQN</topic><topic>Geographical locations</topic><topic>Markov processes</topic><topic>Meal delivery</topic><topic>Policies</topic><topic>Reinforcement learning</topic><topic>Restaurants</topic><topic>Robustness (mathematics)</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jahanshahi, Hadi</creatorcontrib><creatorcontrib>Bozanta, Aysun</creatorcontrib><creatorcontrib>Cevik, Mucahit</creatorcontrib><creatorcontrib>Kavuk, Eray Mert</creatorcontrib><creatorcontrib>Tosun, Ayşe</creatorcontrib><creatorcontrib>Sonuc, Sibel B.</creatorcontrib><creatorcontrib>Kosucu, Bilgin</creatorcontrib><creatorcontrib>Başar, Ayşe</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Knowledge-based systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Jahanshahi, Hadi</au><au>Bozanta, Aysun</au><au>Cevik, Mucahit</au><au>Kavuk, Eray 
Mert</au><au>Tosun, Ayşe</au><au>Sonuc, Sibel B.</au><au>Kosucu, Bilgin</au><au>Başar, Ayşe</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A deep reinforcement learning approach for the meal delivery problem</atitle><jtitle>Knowledge-based systems</jtitle><date>2022-05-11</date><risdate>2022</risdate><volume>243</volume><spage>108489</spage><pages>108489-</pages><artnum>108489</artnum><issn>0950-7051</issn><eissn>1872-7409</eissn><abstract>We consider a meal delivery service fulfilling dynamic customer requests given a set of couriers over the course of a day. A courier’s duty is to pick up an order from a restaurant and deliver it to a customer. We model this service as a Markov decision process and use deep reinforcement learning as the solution approach. We experiment with the resulting policies on synthetic and real-world datasets and compare those with the baseline policies. We also examine the courier utilization for different numbers of couriers. In our analysis, we specifically focus on the impact of the limited available resources in the meal delivery problem. Furthermore, we investigate the effect of intelligent order rejection and re-positioning of the couriers. Our numerical experiments show that, by incorporating the geographical locations of the restaurants, customers, and the depot, our model significantly improves the overall service quality as characterized by the expected total reward and the delivery times. Our results present valuable insights on both the courier assignment process and the optimal number of couriers for different order frequencies on a given day. The proposed model also shows a robust performance under a variety of scenarios for real-world implementation. 
•Our MDP model for the courier assignment task characterizes on-demand meal delivery service.•We tailor deep reinforcement learning algorithms to address the problem in a dynamic environment.•We incorporate the notion of order rejection to reduce the number of late orders.•We investigate the importance of intelligent repositioning of the couriers during their idle times.</abstract><cop>Amsterdam</cop><pub>Elsevier B.V</pub><doi>10.1016/j.knosys.2022.108489</doi><orcidid>https://orcid.org/0000-0003-4934-8326</orcidid><orcidid>https://orcid.org/0000-0003-4020-6305</orcidid><orcidid>https://orcid.org/0000-0003-1859-7872</orcidid><orcidid>https://orcid.org/0000-0001-7248-6263</orcidid><orcidid>https://orcid.org/0000-0002-5569-3938</orcidid><orcidid>https://orcid.org/0000-0002-1768-6278</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0950-7051
ispartof Knowledge-based systems, 2022-05, Vol.243, p.108489, Article 108489
issn 0950-7051
1872-7409
language eng
recordid cdi_proquest_journals_2655163862
source ScienceDirect Journals (5 years ago - present)
subjects Courier assignment
Customer services
Customers
DDQN
Deep learning
Delivery services
DQN
Geographical locations
Markov processes
Meal delivery
Policies
Reinforcement learning
Restaurants
Robustness (mathematics)
title A deep reinforcement learning approach for the meal delivery problem
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T23%3A23%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20deep%20reinforcement%20learning%20approach%20for%20the%20meal%20delivery%20problem&rft.jtitle=Knowledge-based%20systems&rft.au=Jahanshahi,%20Hadi&rft.date=2022-05-11&rft.volume=243&rft.spage=108489&rft.pages=108489-&rft.artnum=108489&rft.issn=0950-7051&rft.eissn=1872-7409&rft_id=info:doi/10.1016/j.knosys.2022.108489&rft_dat=%3Cproquest_cross%3E2655163862%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2655163862&rft_id=info:pmid/&rft_els_id=S0950705122002088&rfr_iscdi=true