A Deep Reinforcement Learning Approach for Solving the Traveling Salesman Problem with Drone

Reinforcement learning has recently shown promise in learning quality solutions in many combinatorial optimization problems. In particular, the attention-based encoder-decoder models show high effectiveness on various routing problems, including the Traveling Salesman Problem (TSP). Unfortunately, t...

Bibliographic Details
Main Authors:	Bogyrbayeva, Aigerim; Yoon, Taehyun; Ko, Hanbum; Lim, Sungbin; Yun, Hyokun; Kwon, Changhyun
Format:	Article
Language:	eng
Subjects:	Computer Science - Artificial Intelligence; Computer Science - Learning; Mathematics - Optimization and Control

creator	Bogyrbayeva, Aigerim; Yoon, Taehyun; Ko, Hanbum; Lim, Sungbin; Yun, Hyokun; Kwon, Changhyun
description	Reinforcement learning has recently shown promise in learning quality solutions in many combinatorial optimization problems. In particular, attention-based encoder-decoder models are highly effective on various routing problems, including the Traveling Salesman Problem (TSP). Unfortunately, they perform poorly on the TSP with Drone (TSP-D), which requires routing a heterogeneous fleet of vehicles, a truck and a drone, in coordination. In TSP-D, the two vehicles move in tandem, and one may need to wait at a node for the other to join. A stateless attention-based decoder fails to achieve such coordination between vehicles. We propose a hybrid model that uses an attention encoder and a Long Short-Term Memory (LSTM) network decoder, in which the decoder's hidden state can represent the sequence of actions made. We empirically demonstrate that such a hybrid model improves upon a purely attention-based model in both solution quality and computational efficiency. Our experiments on the min-max Capacitated Vehicle Routing Problem (mmCVRP) also confirm that the hybrid model is more suitable for the coordinated routing of multiple vehicles than the attention-based model. The proposed model produces results comparable to those of the operations research baseline methods.
doi_str_mv	10.48550/arxiv.2112.12545
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2112_12545</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2112_12545</sourcerecordid><originalsourceid>FETCH-LOGICAL-a675-4bdad6f8cc72aaf1fb5cda811eba7de72215401f2be05e76158d739cb2a4bde13</originalsourceid><addsrcrecordid>eNotj09PhDAUxHvxYFY_gCf7BUBeoRSPZNd_CYnG5bgJeS2vCwm0pEtQv73L6mkyM5lJfozdQRJnhZTJA4bvfokFgIhByExes0PJd0QT_6TeWR8MjeRmXhEG17sjL6cpeDQdP3d874dlDeeOeB1woWF1exzoNKLjH8HrgUb-1c8d3wXv6IZdWRxOdPuvG1Y_P9Xb16h6f3nbllWEuZJRpltsc1sYowSiBaulabEAII2qJSUEyCwBKzQlklQOsmhV-mi0wPOUIN2w-7_bC14zhX7E8NOsmM0FM_0Fg1NOyA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>A Deep Reinforcement Learning Approach for Solving the Traveling Salesman Problem with Drone</title><source>arXiv.org</source><creator>Bogyrbayeva, Aigerim ; Yoon, Taehyun ; Ko, Hanbum ; Lim, Sungbin ; Yun, Hyokun ; Kwon, Changhyun</creator><creatorcontrib>Bogyrbayeva, Aigerim ; Yoon, Taehyun ; Ko, Hanbum ; Lim, Sungbin ; Yun, Hyokun ; Kwon, Changhyun</creatorcontrib><description>Reinforcement learning has recently shown promise in learning quality solutions in many combinatorial optimization problems. In particular, the attention-based encoder-decoder models show high effectiveness on various routing problems, including the Traveling Salesman Problem (TSP). Unfortunately, they perform poorly for the TSP with Drone (TSP-D), requiring routing a heterogeneous fleet of vehicles in coordination -- a truck and a drone. In TSP-D, the two vehicles are moving in tandem and may need to wait at a node for the other vehicle to join. State-less attention-based decoder fails to make such coordination between vehicles. 
We propose a hybrid model that uses an attention encoder and a Long Short-Term Memory (LSTM) network decoder, in which the decoder's hidden state can represent the sequence of actions made. We empirically demonstrate that such a hybrid model improves upon a purely attention-based model for both solution quality and computational efficiency. Our experiments on the min-max Capacitated Vehicle Routing Problem (mmCVRP) also confirm that the hybrid model is more suitable for the coordinated routing of multiple vehicles than the attention-based model. The proposed model demonstrates comparable results as the operations research baseline methods.</description><identifier>DOI: 10.48550/arxiv.2112.12545</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Learning ; Mathematics - Optimization and Control</subject><creationdate>2021-12</creationdate><rights>http://creativecommons.org/licenses/by-nc-nd/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2112.12545$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2112.12545$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Bogyrbayeva, Aigerim</creatorcontrib><creatorcontrib>Yoon, Taehyun</creatorcontrib><creatorcontrib>Ko, Hanbum</creatorcontrib><creatorcontrib>Lim, Sungbin</creatorcontrib><creatorcontrib>Yun, Hyokun</creatorcontrib><creatorcontrib>Kwon, Changhyun</creatorcontrib><title>A Deep Reinforcement Learning Approach for Solving the Traveling Salesman Problem with Drone</title><description>Reinforcement learning has recently shown promise in learning 
quality solutions in many combinatorial optimization problems. In particular, the attention-based encoder-decoder models show high effectiveness on various routing problems, including the Traveling Salesman Problem (TSP). Unfortunately, they perform poorly for the TSP with Drone (TSP-D), requiring routing a heterogeneous fleet of vehicles in coordination -- a truck and a drone. In TSP-D, the two vehicles are moving in tandem and may need to wait at a node for the other vehicle to join. State-less attention-based decoder fails to make such coordination between vehicles. We propose a hybrid model that uses an attention encoder and a Long Short-Term Memory (LSTM) network decoder, in which the decoder's hidden state can represent the sequence of actions made. We empirically demonstrate that such a hybrid model improves upon a purely attention-based model for both solution quality and computational efficiency. Our experiments on the min-max Capacitated Vehicle Routing Problem (mmCVRP) also confirm that the hybrid model is more suitable for the coordinated routing of multiple vehicles than the attention-based model. 
The proposed model demonstrates comparable results as the operations research baseline methods.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Learning</subject><subject>Mathematics - Optimization and Control</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj09PhDAUxHvxYFY_gCf7BUBeoRSPZNd_CYnG5bgJeS2vCwm0pEtQv73L6mkyM5lJfozdQRJnhZTJA4bvfokFgIhByExes0PJd0QT_6TeWR8MjeRmXhEG17sjL6cpeDQdP3d874dlDeeOeB1woWF1exzoNKLjH8HrgUb-1c8d3wXv6IZdWRxOdPuvG1Y_P9Xb16h6f3nbllWEuZJRpltsc1sYowSiBaulabEAII2qJSUEyCwBKzQlklQOsmhV-mi0wPOUIN2w-7_bC14zhX7E8NOsmM0FM_0Fg1NOyA</recordid><startdate>20211221</startdate><enddate>20211221</enddate><creator>Bogyrbayeva, Aigerim</creator><creator>Yoon, Taehyun</creator><creator>Ko, Hanbum</creator><creator>Lim, Sungbin</creator><creator>Yun, Hyokun</creator><creator>Kwon, Changhyun</creator><scope>AKY</scope><scope>AKZ</scope><scope>GOX</scope></search><sort><creationdate>20211221</creationdate><title>A Deep Reinforcement Learning Approach for Solving the Traveling Salesman Problem with Drone</title><author>Bogyrbayeva, Aigerim ; Yoon, Taehyun ; Ko, Hanbum ; Lim, Sungbin ; Yun, Hyokun ; Kwon, Changhyun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a675-4bdad6f8cc72aaf1fb5cda811eba7de72215401f2be05e76158d739cb2a4bde13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Learning</topic><topic>Mathematics - Optimization and Control</topic><toplevel>online_resources</toplevel><creatorcontrib>Bogyrbayeva, Aigerim</creatorcontrib><creatorcontrib>Yoon, Taehyun</creatorcontrib><creatorcontrib>Ko, Hanbum</creatorcontrib><creatorcontrib>Lim, Sungbin</creatorcontrib><creatorcontrib>Yun, 
Hyokun</creatorcontrib><creatorcontrib>Kwon, Changhyun</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv Mathematics</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Bogyrbayeva, Aigerim</au><au>Yoon, Taehyun</au><au>Ko, Hanbum</au><au>Lim, Sungbin</au><au>Yun, Hyokun</au><au>Kwon, Changhyun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Deep Reinforcement Learning Approach for Solving the Traveling Salesman Problem with Drone</atitle><date>2021-12-21</date><risdate>2021</risdate><abstract>Reinforcement learning has recently shown promise in learning quality solutions in many combinatorial optimization problems. In particular, the attention-based encoder-decoder models show high effectiveness on various routing problems, including the Traveling Salesman Problem (TSP). Unfortunately, they perform poorly for the TSP with Drone (TSP-D), requiring routing a heterogeneous fleet of vehicles in coordination -- a truck and a drone. In TSP-D, the two vehicles are moving in tandem and may need to wait at a node for the other vehicle to join. State-less attention-based decoder fails to make such coordination between vehicles. We propose a hybrid model that uses an attention encoder and a Long Short-Term Memory (LSTM) network decoder, in which the decoder's hidden state can represent the sequence of actions made. We empirically demonstrate that such a hybrid model improves upon a purely attention-based model for both solution quality and computational efficiency. Our experiments on the min-max Capacitated Vehicle Routing Problem (mmCVRP) also confirm that the hybrid model is more suitable for the coordinated routing of multiple vehicles than the attention-based model. 
The proposed model demonstrates comparable results as the operations research baseline methods.</abstract><doi>10.48550/arxiv.2112.12545</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2112.12545
language	eng
recordid	cdi_arxiv_primary_2112_12545
source	arXiv.org
subjects	Computer Science - Artificial Intelligence; Computer Science - Learning; Mathematics - Optimization and Control
title	A Deep Reinforcement Learning Approach for Solving the Traveling Salesman Problem with Drone
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T21%3A51%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Deep%20Reinforcement%20Learning%20Approach%20for%20Solving%20the%20Traveling%20Salesman%20Problem%20with%20Drone&rft.au=Bogyrbayeva,%20Aigerim&rft.date=2021-12-21&rft_id=info:doi/10.48550/arxiv.2112.12545&rft_dat=%3Carxiv_GOX%3E2112_12545%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true