Deep Reinforcement Learning for Delay-Oriented IoT Task Scheduling in SAGIN

In this article, we investigate a computing task scheduling problem in space-air-ground integrated network (SAGIN) for delay-oriented Internet of Things (IoT) services. In the considered scenario, an unmanned aerial vehicle (UAV) collects computing tasks from IoT devices and then makes online offloading decisions, in which the tasks can be processed at the UAV or offloaded to the nearby base station or the remote satellite. Our objective is to design a task scheduling policy that minimizes the offloading and computing delay of all tasks under the UAV energy capacity constraint. To this end, we first formulate the online scheduling problem as an energy-constrained Markov decision process (MDP). Then, considering the task arrival dynamics, we develop a novel deep risk-sensitive reinforcement learning algorithm. Specifically, the algorithm evaluates the risk, which measures the energy consumption that exceeds the constraint, for each state, and searches for the optimal parameter weighing the minimization of delay against risk while learning the optimal policy. Extensive simulation results demonstrate that the proposed algorithm can reduce the task processing delay by up to 30% compared to probabilistic configuration methods while satisfying the UAV energy capacity constraint.
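The risk-sensitive objective described in the abstract can be sketched as a weighted cost that combines task delay with the energy consumption exceeding the UAV's budget. The following is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function names, the weight `lam`, and all numeric values are hypothetical.

```python
# Illustrative sketch of the risk-sensitive objective from the abstract:
# minimize task delay while penalizing only the energy consumption that
# exceeds the UAV's capacity constraint. All names and numbers here are
# assumptions for illustration, not the paper's code.

def energy_risk(energy_used: float, energy_budget: float) -> float:
    """Risk term: only energy consumption beyond the budget counts."""
    return max(0.0, energy_used - energy_budget)

def risk_sensitive_cost(delay: float, energy_used: float,
                        energy_budget: float, lam: float) -> float:
    """Weighted cost: delay plus lam times the constraint-violation risk.

    The paper searches for the optimal weight (here `lam`) while
    learning the scheduling policy.
    """
    return delay + lam * energy_risk(energy_used, energy_budget)

# Example: a task with 0.8 s delay and 1.2 J consumed against a
# 1.0 J budget, weighted with lam = 5.0.
cost = risk_sensitive_cost(delay=0.8, energy_used=1.2,
                           energy_budget=1.0, lam=5.0)
```

In this sketch a task that stays within the energy budget incurs only its delay, so the weight `lam` controls how strongly constraint violations are discouraged during learning.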

Bibliographic Details

Published in: IEEE Transactions on Wireless Communications, 2021-02, Vol. 20 (2), p. 911-925
Authors: Zhou, Conghao; Wu, Wen; He, Hongli; Yang, Peng; Lyu, Feng; Cheng, Nan; Shen, Xuemin
Format: Article
Language: English
Online access: Request full text
DOI: 10.1109/TWC.2020.3029143
ISSN: 1536-1276
EISSN: 1558-2248
Source: IEEE Electronic Library (IEL)
Subjects: Algorithms; Computation; Computer aided scheduling; constrained MDP; Constraints; Delay; Delays; edge computing; Energy consumption; Heuristic algorithms; Internet of Things; IoT; Machine learning; Markov processes; Optimization; Processor scheduling; reinforcement learning; Risk; Scheduling; Space-air-ground integrated network; Task analysis; Task scheduling; Unmanned aerial vehicles; US Department of Transportation