A Minibatch-SGD-Based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy
Stochastic gradient descent (SGD) has proven effective in solving many inventory control problems with demand learning. However, it often faces the pitfall of an infeasible target inventory level that is lower than the current inventory level. Several recent works (e.g., Huh and Rusmevichientong (20...
Gespeichert in:
Veröffentlicht in: | arXiv.org 2024-08 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Lyu, Jiameng Xie, Jinxing Yuan, Shilin Zhou, Yuan |
description | Stochastic gradient descent (SGD) has proven effective in solving many inventory control problems with demand learning. However, it often faces the pitfall of an infeasible target inventory level that is lower than the current inventory level. Several recent works (e.g., Huh and Rusmevichientong (2009), Shi et al.(2016)) are successful to resolve this issue in various inventory systems. However, their techniques are rather sophisticated and difficult to be applied to more complicated scenarios such as multi-product and multi-constraint inventory systems. In this paper, we address the infeasible-target-inventory-level issue from a new technical perspective -- we propose a novel minibatch-SGD-based meta-policy. Our meta-policy is flexible enough to be applied to a general inventory systems framework covering a wide range of inventory management problems with myopic clairvoyant optimal policy. By devising the optimal minibatch scheme, our meta-policy achieves a regret bound of \(\mathcal{O}(\sqrt{T})\) for the general convex case and \(\mathcal{O}(\log T)\) for the strongly convex case. To demonstrate the power and flexibility of our meta-policy, we apply it to three important inventory control problems: multi-product and multi-constraint systems, multi-echelon serial systems, and one-warehouse and multi-store systems by carefully designing application-specific subroutines.We also conduct extensive numerical experiments to demonstrate that our meta-policy enjoys competitive regret performance, high computational efficiency, and low variances among a wide range of applications. |
format | Article |
fullrecord | <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3098951503</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3098951503</sourcerecordid><originalsourceid>FETCH-proquest_journals_30989515033</originalsourceid><addsrcrecordid>eNqNi08LgjAcQEcQJNV3-EHnwdyy9Nj_gqSgbh1k2cyJbrbNwm9fUB-g0zu89zrIo4z5OBxT2kNDawtCCJ1MaRAwD11mEEslr9ylOT5tlnjOrbjBXnCjpLpDLBzHR13KtIVMG9ipp1BOmxZOrXWisvCSLoe41bVM4VA7WfESvsMAdTNeWjH8sY9G69V5scW10Y9GWJcUujHqoxJGojAK_IAw9l_1BtjvQjE</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3098951503</pqid></control><display><type>article</type><title>A Minibatch-SGD-Based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy</title><source>Free E- Journals</source><creator>Lyu, Jiameng ; Xie, Jinxing ; Yuan, Shilin ; Zhou, Yuan</creator><creatorcontrib>Lyu, Jiameng ; Xie, Jinxing ; Yuan, Shilin ; Zhou, Yuan</creatorcontrib><description>Stochastic gradient descent (SGD) has proven effective in solving many inventory control problems with demand learning. However, it often faces the pitfall of an infeasible target inventory level that is lower than the current inventory level. Several recent works (e.g., Huh and Rusmevichientong (2009), Shi et al.(2016)) are successful to resolve this issue in various inventory systems. However, their techniques are rather sophisticated and difficult to be applied to more complicated scenarios such as multi-product and multi-constraint inventory systems. In this paper, we address the infeasible-target-inventory-level issue from a new technical perspective -- we propose a novel minibatch-SGD-based meta-policy. Our meta-policy is flexible enough to be applied to a general inventory systems framework covering a wide range of inventory management problems with myopic clairvoyant optimal policy. By devising the optimal minibatch scheme, our meta-policy achieves a regret bound of \(\mathcal{O}(\sqrt{T})\) for the general convex case and \(\mathcal{O}(\log T)\) for the strongly convex case. To demonstrate the power and flexibility of our meta-policy, we apply it to three important inventory control problems: multi-product and multi-constraint systems, multi-echelon serial systems, and one-warehouse and multi-store systems by carefully designing application-specific subroutines.We also conduct extensive numerical experiments to demonstrate that our meta-policy enjoys competitive regret performance, high computational efficiency, and low variances among a wide range of applications.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Constraints ; Control systems ; Inventory ; Inventory control ; Inventory management ; Learning</subject><ispartof>arXiv.org, 2024-08</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>776,780</link.rule.ids></links><search><creatorcontrib>Lyu, Jiameng</creatorcontrib><creatorcontrib>Xie, Jinxing</creatorcontrib><creatorcontrib>Yuan, Shilin</creatorcontrib><creatorcontrib>Zhou, Yuan</creatorcontrib><title>A Minibatch-SGD-Based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy</title><title>arXiv.org</title><description>Stochastic gradient descent (SGD) has proven effective in solving many inventory control problems with demand learning. However, it often faces the pitfall of an infeasible target inventory level that is lower than the current inventory level. Several recent works (e.g., Huh and Rusmevichientong (2009), Shi et al.(2016)) are successful to resolve this issue in various inventory systems. However, their techniques are rather sophisticated and difficult to be applied to more complicated scenarios such as multi-product and multi-constraint inventory systems. In this paper, we address the infeasible-target-inventory-level issue from a new technical perspective -- we propose a novel minibatch-SGD-based meta-policy. Our meta-policy is flexible enough to be applied to a general inventory systems framework covering a wide range of inventory management problems with myopic clairvoyant optimal policy. By devising the optimal minibatch scheme, our meta-policy achieves a regret bound of \(\mathcal{O}(\sqrt{T})\) for the general convex case and \(\mathcal{O}(\log T)\) for the strongly convex case. To demonstrate the power and flexibility of our meta-policy, we apply it to three important inventory control problems: multi-product and multi-constraint systems, multi-echelon serial systems, and one-warehouse and multi-store systems by carefully designing application-specific subroutines.We also conduct extensive numerical experiments to demonstrate that our meta-policy enjoys competitive regret performance, high computational efficiency, and low variances among a wide range of applications.</description><subject>Constraints</subject><subject>Control systems</subject><subject>Inventory</subject><subject>Inventory control</subject><subject>Inventory management</subject><subject>Learning</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNqNi08LgjAcQEcQJNV3-EHnwdyy9Nj_gqSgbh1k2cyJbrbNwm9fUB-g0zu89zrIo4z5OBxT2kNDawtCCJ1MaRAwD11mEEslr9ylOT5tlnjOrbjBXnCjpLpDLBzHR13KtIVMG9ipp1BOmxZOrXWisvCSLoe41bVM4VA7WfESvsMAdTNeWjH8sY9G69V5scW10Y9GWJcUujHqoxJGojAK_IAw9l_1BtjvQjE</recordid><startdate>20240829</startdate><enddate>20240829</enddate><creator>Lyu, Jiameng</creator><creator>Xie, Jinxing</creator><creator>Yuan, Shilin</creator><creator>Zhou, Yuan</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240829</creationdate><title>A Minibatch-SGD-Based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy</title><author>Lyu, Jiameng ; Xie, Jinxing ; Yuan, Shilin ; Zhou, Yuan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_30989515033</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Constraints</topic><topic>Control systems</topic><topic>Inventory</topic><topic>Inventory control</topic><topic>Inventory management</topic><topic>Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Lyu, Jiameng</creatorcontrib><creatorcontrib>Xie, Jinxing</creatorcontrib><creatorcontrib>Yuan, Shilin</creatorcontrib><creatorcontrib>Zhou, Yuan</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lyu, Jiameng</au><au>Xie, Jinxing</au><au>Yuan, Shilin</au><au>Zhou, Yuan</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>A Minibatch-SGD-Based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy</atitle><jtitle>arXiv.org</jtitle><date>2024-08-29</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Stochastic gradient descent (SGD) has proven effective in solving many inventory control problems with demand learning. However, it often faces the pitfall of an infeasible target inventory level that is lower than the current inventory level. Several recent works (e.g., Huh and Rusmevichientong (2009), Shi et al.(2016)) are successful to resolve this issue in various inventory systems. However, their techniques are rather sophisticated and difficult to be applied to more complicated scenarios such as multi-product and multi-constraint inventory systems. In this paper, we address the infeasible-target-inventory-level issue from a new technical perspective -- we propose a novel minibatch-SGD-based meta-policy. Our meta-policy is flexible enough to be applied to a general inventory systems framework covering a wide range of inventory management problems with myopic clairvoyant optimal policy. By devising the optimal minibatch scheme, our meta-policy achieves a regret bound of \(\mathcal{O}(\sqrt{T})\) for the general convex case and \(\mathcal{O}(\log T)\) for the strongly convex case. To demonstrate the power and flexibility of our meta-policy, we apply it to three important inventory control problems: multi-product and multi-constraint systems, multi-echelon serial systems, and one-warehouse and multi-store systems by carefully designing application-specific subroutines.We also conduct extensive numerical experiments to demonstrate that our meta-policy enjoys competitive regret performance, high computational efficiency, and low variances among a wide range of applications.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-08 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3098951503 |
source | Free E- Journals |
subjects | Constraints Control systems Inventory Inventory control Inventory management Learning |
title | A Minibatch-SGD-Based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T15%3A09%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=A%20Minibatch-SGD-Based%20Learning%20Meta-Policy%20for%20Inventory%20Systems%20with%20Myopic%20Optimal%20Policy&rft.jtitle=arXiv.org&rft.au=Lyu,%20Jiameng&rft.date=2024-08-29&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3098951503%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3098951503&rft_id=info:pmid/&rfr_iscdi=true |