Deep Reinforcement Learning for Delay-Oriented IoT Task Scheduling in SAGIN
In this article, we investigate a computing task scheduling problem in space-air-ground integrated network (SAGIN) for delay-oriented Internet of Things (IoT) services. In the considered scenario, an unmanned aerial vehicle (UAV) collects computing tasks from IoT devices and then makes online offloading decisions, in which the tasks can be processed at the UAV or offloaded to the nearby base station or the remote satellite.
Saved in:
Published in: | IEEE transactions on wireless communications 2021-02, Vol.20 (2), p.911-925 |
---|---|
Main authors: | Zhou, Conghao; Wu, Wen; He, Hongli; Yang, Peng; Lyu, Feng; Cheng, Nan; Shen, Xuemin |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 925 |
---|---|
container_issue | 2 |
container_start_page | 911 |
container_title | IEEE transactions on wireless communications |
container_volume | 20 |
creator | Zhou, Conghao; Wu, Wen; He, Hongli; Yang, Peng; Lyu, Feng; Cheng, Nan; Shen, Xuemin |
description | In this article, we investigate a computing task scheduling problem in space-air-ground integrated network (SAGIN) for delay-oriented Internet of Things (IoT) services. In the considered scenario, an unmanned aerial vehicle (UAV) collects computing tasks from IoT devices and then makes online offloading decisions, in which the tasks can be processed at the UAV or offloaded to the nearby base station or the remote satellite. Our objective is to design a task scheduling policy that minimizes offloading and computing delay of all tasks given the UAV energy capacity constraint. To this end, we first formulate the online scheduling problem as an energy-constrained Markov decision process (MDP). Then, considering the task arrival dynamics, we develop a novel deep risk-sensitive reinforcement learning algorithm. Specifically, the algorithm evaluates the risk, which measures the energy consumption that exceeds the constraint, for each state and searches the optimal parameter weighing the minimization of delay and risk while learning the optimal policy. Extensive simulation results demonstrate that the proposed algorithm can reduce the task processing delay by up to 30% compared to probabilistic configuration methods while satisfying the UAV energy capacity constraint. |
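The abstract's core idea, minimizing task delay while penalizing energy consumption that exceeds the UAV's budget, can be illustrated with a toy scheduler. The sketch below is purely illustrative and is not the paper's algorithm: all action names, delay/energy values, and the greedy one-step rule are assumptions, standing in for the paper's risk-sensitive deep RL policy. Each task is assigned to the UAV, base station, or satellite by minimizing delay plus a weighted marginal "risk", where risk is defined, as in the abstract, as energy consumption beyond the constraint.

```python
# Hypothetical toy sketch of the delay-vs-risk trade-off described in the
# abstract. The action set, delay/energy figures, and greedy rule are all
# illustrative assumptions, not taken from the paper.

ACTIONS = ["uav", "base_station", "satellite"]      # where a task is processed
DELAY = {"uav": 1.0, "base_station": 2.0, "satellite": 5.0}   # per-task delay
ENERGY = {"uav": 3.0, "base_station": 1.0, "satellite": 0.5}  # UAV energy cost

def risk(energy_used, budget):
    """Risk = energy consumption exceeding the constraint (0 while within budget)."""
    return max(0.0, energy_used - budget)

def schedule(num_tasks, budget, weight):
    """Greedily assign each task to the action minimizing delay + weight * marginal risk."""
    total_delay, energy_used = 0.0, 0.0
    for _ in range(num_tasks):
        def cost(action):
            new_energy = energy_used + ENERGY[action]
            # Marginal risk: how much this action pushes us further past the budget.
            return DELAY[action] + weight * (risk(new_energy, budget) - risk(energy_used, budget))
        action = min(ACTIONS, key=cost)
        total_delay += DELAY[action]
        energy_used += ENERGY[action]
    return total_delay, energy_used
```

With `weight=0` the scheduler always picks the lowest-delay action (the UAV) and blows through the energy budget; with a large `weight` it switches to offloading once the budget is nearly spent, trading delay for constraint satisfaction. The paper's contribution is learning this weighting online with deep RL rather than fixing it by hand.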
doi_str_mv | 10.1109/TWC.2020.3029143 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1536-1276 |
ispartof | IEEE transactions on wireless communications, 2021-02, Vol.20 (2), p.911-925 |
issn | 1536-1276 1558-2248 |
language | eng |
recordid | cdi_ieee_primary_9222519 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms; Computation; Computer aided scheduling; constrained MDP; Constraints; Delay; Delays; edge computing; Energy consumption; Heuristic algorithms; Internet of Things; IoT; Machine learning; Markov processes; Optimization; Processor scheduling; reinforcement learning; Risk; Scheduling; Space-air-ground integrated network; Task analysis; Task scheduling; Unmanned aerial vehicles; US Department of Transportation |
title | Deep Reinforcement Learning for Delay-Oriented IoT Task Scheduling in SAGIN |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T02%3A31%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Reinforcement%20Learning%20for%20Delay-Oriented%20IoT%20Task%20Scheduling%20in%20SAGIN&rft.jtitle=IEEE%20transactions%20on%20wireless%20communications&rft.au=Zhou,%20Conghao&rft.date=2021-02&rft.volume=20&rft.issue=2&rft.spage=911&rft.epage=925&rft.pages=911-925&rft.issn=1536-1276&rft.eissn=1558-2248&rft.coden=ITWCAX&rft_id=info:doi/10.1109/TWC.2020.3029143&rft_dat=%3Cproquest_RIE%3E2488744844%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2488744844&rft_id=info:pmid/&rft_ieee_id=9222519&rfr_iscdi=true |