A Minibatch-SGD-Based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy

Stochastic gradient descent (SGD) has proven effective in solving many inventory control problems with demand learning. However, it often faces the pitfall of an infeasible target inventory level that is lower than the current inventory level. Several recent works (e.g., Huh and Rusmevichientong (20...

Detailed description

Bibliographic details
Published in: arXiv.org 2024-08
Main authors: Lyu, Jiameng; Xie, Jinxing; Yuan, Shilin; Zhou, Yuan
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Lyu, Jiameng
Xie, Jinxing
Yuan, Shilin
Zhou, Yuan
description Stochastic gradient descent (SGD) has proven effective in solving many inventory control problems with demand learning. However, it often faces the pitfall of an infeasible target inventory level that is lower than the current inventory level. Several recent works (e.g., Huh and Rusmevichientong (2009), Shi et al. (2016)) have successfully resolved this issue in various inventory systems. However, their techniques are rather sophisticated and difficult to apply to more complicated scenarios such as multi-product and multi-constraint inventory systems. In this paper, we address the infeasible-target-inventory-level issue from a new technical perspective: we propose a novel minibatch-SGD-based meta-policy. Our meta-policy is flexible enough to be applied to a general inventory systems framework covering a wide range of inventory management problems with a myopic clairvoyant optimal policy. By devising the optimal minibatch scheme, our meta-policy achieves a regret bound of \(\mathcal{O}(\sqrt{T})\) for the general convex case and \(\mathcal{O}(\log T)\) for the strongly convex case. To demonstrate the power and flexibility of our meta-policy, we apply it to three important inventory control problems (multi-product and multi-constraint systems, multi-echelon serial systems, and one-warehouse and multi-store systems) by carefully designing application-specific subroutines. We also conduct extensive numerical experiments to demonstrate that our meta-policy enjoys competitive regret performance, high computational efficiency, and low variance across a wide range of applications.
format Article
publisher Ithaca: Cornell University Library, arXiv.org
date 2024-08-29
rights 2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
oa free_for_read
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-08
issn 2331-8422
language eng
recordid cdi_proquest_journals_3098951503
source Free E- Journals
subjects Constraints
Control systems
Inventory
Inventory control
Inventory management
Learning
title A Minibatch-SGD-Based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T15%3A09%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=A%20Minibatch-SGD-Based%20Learning%20Meta-Policy%20for%20Inventory%20Systems%20with%20Myopic%20Optimal%20Policy&rft.jtitle=arXiv.org&rft.au=Lyu,%20Jiameng&rft.date=2024-08-29&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3098951503%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3098951503&rft_id=info:pmid/&rfr_iscdi=true