TT-QI: Faster Value Iteration in Tensor Train Format for Stochastic Optimal Control

The problem of general nonlinear stochastic optimal control with small Wiener noise is studied. The problem is approximated by a Markov decision process. The Bellman equation is solved using the Value Iteration (VI) algorithm in the low-rank Tensor Train format (TT-VI). In this paper a modification of the...
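
For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of ordinary, dense value iteration on a small Markov decision process. It is not the TT-VI or TT-QI algorithm of the paper: there is no tensor-train compression and no TT-CROSS step. The toy chain problem and the names `bellman_backup`, `value_iteration`, and `make_chain` are assumptions introduced only for illustration.

```python
# Dense tabular value iteration on a toy MDP.
# Illustrative only: this is NOT the paper's TT-VI/TT-QI method.


def bellman_backup(V, P, R, gamma):
    """Apply the Bellman optimality operator once.

    P[s][a] is a list of (probability, next_state) pairs and
    R[s][a] is the immediate reward for action a in state s.
    Returns the updated value list and the Q-table it was built from.
    """
    Q = [[R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
          for a in range(len(P[s]))]
         for s in range(len(P))]
    return [max(q_s) for q_s in Q], Q


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Iterate the backup until successive value lists differ by less than tol."""
    V = [0.0] * len(P)
    for _ in range(max_iter):
        V_new, Q = bellman_backup(V, P, R, gamma)
        delta = max(abs(a - b) for a, b in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # Greedy policy: the action with the largest Q-value in each state.
    policy = [max(range(len(q_s)), key=q_s.__getitem__) for q_s in Q]
    return V, policy


def make_chain(n=5, slip=0.1):
    """Hypothetical toy problem: noisy walk on states 0..n-1.

    Action 0 moves left, action 1 moves right; with probability `slip`
    the state does not change. Acting in the right-most state pays 1.
    """
    P, R = [], []
    for s in range(n):
        left, right = max(s - 1, 0), min(s + 1, n - 1)
        P.append([[(1 - slip, left), (slip, s)],
                  [(1 - slip, right), (slip, s)]])
        R.append([1.0 if s == n - 1 else 0.0] * 2)
    return P, R


P, R = make_chain()
V, policy = value_iteration(P, R)
```

In this dense form every backup touches each state-action pair explicitly, which becomes infeasible as the number of state dimensions grows; according to the abstract, the paper addresses this by keeping the solution in a low-rank Tensor Train format.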

Detailed description

Bibliographic details
Published in: Computational mathematics and mathematical physics 2021-05, Vol.61 (5), p.836-846
Main authors: Boyko, A. I., Oseledets, I. V., Ferrer, G.
Format: Article
Language: eng
Online access: Full text
container_end_page 846
container_issue 5
container_start_page 836
container_title Computational mathematics and mathematical physics
container_volume 61
creator Boyko, A. I.
Oseledets, I. V.
Ferrer, G.
description The problem of general nonlinear stochastic optimal control with small Wiener noise is studied. The problem is approximated by a Markov decision process. The Bellman equation is solved using the Value Iteration (VI) algorithm in the low-rank Tensor Train format (TT-VI). In this paper, the authors propose a modification of the TT-VI algorithm called TT-Q-Iteration (TT-QI). In it, the nonlinear Bellman optimality operator is applied iteratively to the solution as a composition of internal Tensor Train algebraic operations and the TT-CROSS algorithm. We show that it has lower asymptotic complexity per iteration than the existing method in the literature, provided that the TT-ranks of the transition probabilities are small. In test examples of an underpowered inverted pendulum and Dubins cars, our method shows up to 3–10 times faster convergence in wall-clock time than the original method.
doi_str_mv 10.1134/S0965542521050043
format Article
fullrecord <record><control><sourceid>proquest_webof</sourceid><recordid>TN_cdi_proquest_journals_2547574742</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2547574742</sourcerecordid><originalsourceid>FETCH-LOGICAL-c198t-17c653250ab4d27ab3e53d1a50dbefed8d2e716526eec2188089a8e8da6ef85c3</originalsourceid><addsrcrecordid>eNqNkF1LwzAUhoMoOKc_wLuAl1JN0iZNvZPidDAQWfW2pOmpdnTJTDLEf29mRS9E8Cof53lOTl6ETim5oDTNLpekEJxnjDNKOCFZuocmlHOeCCHYPprsysmufoiOvF8RQkUh0wlaVlXyML_CM-UDOPykhi3gedyq0FuDe4MrMN46XDkVDzPr1irgLl4sg9Uv0eo1vt-Efq0GXFoTnB2O0UGnBg8nX-sUPc5uqvIuWdzfzsvrRaJpIUNCcy14yjhRTdayXDUp8LSlipO2gQ5a2TLIqeBMAGhGpSSyUBJkqwR0kut0is7GvhtnX7fgQ72yW2fikzXjWc7zLM9YpOhIaWe9d9DVGxende81JfUuu_pXdtGRo_MGje287sFo-PYIIULIQogIE5qWffgMq7RbE6J6_n810mykfSTMM7ifL_w93QdyipBJ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2547574742</pqid></control><display><type>article</type><title>TT-QI: Faster Value Iteration in Tensor Train Format for Stochastic Optimal Control</title><source>SpringerNature Journals</source><source>Web of Science - Science Citation Index Expanded - 2021&lt;img src="https://exlibris-pub.s3.amazonaws.com/fromwos-v2.jpg" /&gt;</source><creator>Boyko, A. I. ; Oseledets, I. V. ; Ferrer, G.</creator><creatorcontrib>Boyko, A. I. ; Oseledets, I. V. ; Ferrer, G.</creatorcontrib><description>The problem of general non-linear stochastic optimal control with small Wiener noise is studied. The problem is approximated by a Markov Decision Process. Bellman Equation is solved using Value Iteration (VI) algorithm in the low rank Tensor Train format (TT-VI). In this paper a modification of the TT-VI algorithm called TT-Q-Iteration (TT-QI) is proposed by authors. In it, the nonlinear Bellman Optimality Operator is iteratively applied to the solution as a composition of internal Tensor Train algebraic operations and TT-CROSS algorithm. 
We show that it has lower asymptotic complexity per iteration than the method existing in the literature, provided that TT-ranks of transition probabilities are small. In test examples of an underpowered inverted pendulum and Dubins cars our method shows up to 3–10 times faster convergence in terms of wall clock time compared with the original method.</description><identifier>ISSN: 0965-5425</identifier><identifier>EISSN: 1555-6662</identifier><identifier>DOI: 10.1134/S0965542521050043</identifier><language>eng</language><publisher>Moscow: Pleiades Publishing</publisher><subject>Algorithms ; Asymptotic methods ; Computational Mathematics and Numerical Analysis ; Format ; Markov processes ; Mathematical analysis ; Mathematics ; Mathematics and Statistics ; Mathematics, Applied ; Noise control ; Nonlinear control ; Optimal Control ; Optimization ; Physical Sciences ; Physics ; Physics, Mathematical ; Railroad cars ; Science &amp; Technology ; Tensors ; Transition probabilities</subject><ispartof>Computational mathematics and mathematical physics, 2021-05, Vol.61 (5), p.836-846</ispartof><rights>Pleiades Publishing, Ltd. 2021. ISSN 0965-5425, Computational Mathematics and Mathematical Physics, 2021, Vol. 61, No. 5, pp. 836–846. © Pleiades Publishing, Ltd., 2021. Russian Text © The Author(s), 2021, published in Zhurnal Vychislitel’noi Matematiki i Matematicheskoi Fiziki, 2021, Vol. 61, No. 5, pp. 
865–877.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>true</woscitedreferencessubscribed><woscitedreferencescount>2</woscitedreferencescount><woscitedreferencesoriginalsourcerecordid>wos000668966500013</woscitedreferencesoriginalsourcerecordid><cites>FETCH-LOGICAL-c198t-17c653250ab4d27ab3e53d1a50dbefed8d2e716526eec2188089a8e8da6ef85c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1134/S0965542521050043$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1134/S0965542521050043$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>315,782,786,27931,27932,39265,41495,42564,51326</link.rule.ids></links><search><creatorcontrib>Boyko, A. I.</creatorcontrib><creatorcontrib>Oseledets, I. V.</creatorcontrib><creatorcontrib>Ferrer, G.</creatorcontrib><title>TT-QI: Faster Value Iteration in Tensor Train Format for Stochastic Optimal Control</title><title>Computational mathematics and mathematical physics</title><addtitle>Comput. Math. and Math. Phys</addtitle><addtitle>COMP MATH MATH PHYS</addtitle><description>The problem of general non-linear stochastic optimal control with small Wiener noise is studied. The problem is approximated by a Markov Decision Process. Bellman Equation is solved using Value Iteration (VI) algorithm in the low rank Tensor Train format (TT-VI). In this paper a modification of the TT-VI algorithm called TT-Q-Iteration (TT-QI) is proposed by authors. In it, the nonlinear Bellman Optimality Operator is iteratively applied to the solution as a composition of internal Tensor Train algebraic operations and TT-CROSS algorithm. We show that it has lower asymptotic complexity per iteration than the method existing in the literature, provided that TT-ranks of transition probabilities are small. 
In test examples of an underpowered inverted pendulum and Dubins cars our method shows up to 3–10 times faster convergence in terms of wall clock time compared with the original method.</description><subject>Algorithms</subject><subject>Asymptotic methods</subject><subject>Computational Mathematics and Numerical Analysis</subject><subject>Format</subject><subject>Markov processes</subject><subject>Mathematical analysis</subject><subject>Mathematics</subject><subject>Mathematics and Statistics</subject><subject>Mathematics, Applied</subject><subject>Noise control</subject><subject>Nonlinear control</subject><subject>Optimal Control</subject><subject>Optimization</subject><subject>Physical Sciences</subject><subject>Physics</subject><subject>Physics, Mathematical</subject><subject>Railroad cars</subject><subject>Science &amp; Technology</subject><subject>Tensors</subject><subject>Transition probabilities</subject><issn>0965-5425</issn><issn>1555-6662</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>HGBXW</sourceid><recordid>eNqNkF1LwzAUhoMoOKc_wLuAl1JN0iZNvZPidDAQWfW2pOmpdnTJTDLEf29mRS9E8Cof53lOTl6ETim5oDTNLpekEJxnjDNKOCFZuocmlHOeCCHYPprsysmufoiOvF8RQkUh0wlaVlXyML_CM-UDOPykhi3gedyq0FuDe4MrMN46XDkVDzPr1irgLl4sg9Uv0eo1vt-Efq0GXFoTnB2O0UGnBg8nX-sUPc5uqvIuWdzfzsvrRaJpIUNCcy14yjhRTdayXDUp8LSlipO2gQ5a2TLIqeBMAGhGpSSyUBJkqwR0kut0is7GvhtnX7fgQ72yW2fikzXjWc7zLM9YpOhIaWe9d9DVGxende81JfUuu_pXdtGRo_MGje287sFo-PYIIULIQogIE5qWffgMq7RbE6J6_n810mykfSTMM7ifL_w93QdyipBJ</recordid><startdate>20210501</startdate><enddate>20210501</enddate><creator>Boyko, A. I.</creator><creator>Oseledets, I. 
V.</creator><creator>Ferrer, G.</creator><general>Pleiades Publishing</general><general>Pleiades Publishing Inc</general><general>Springer Nature B.V</general><scope>BLEPL</scope><scope>DTL</scope><scope>HGBXW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7TB</scope><scope>7U5</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20210501</creationdate><title>TT-QI: Faster Value Iteration in Tensor Train Format for Stochastic Optimal Control</title><author>Boyko, A. I. ; Oseledets, I. V. ; Ferrer, G.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c198t-17c653250ab4d27ab3e53d1a50dbefed8d2e716526eec2188089a8e8da6ef85c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Asymptotic methods</topic><topic>Computational Mathematics and Numerical Analysis</topic><topic>Format</topic><topic>Markov processes</topic><topic>Mathematical analysis</topic><topic>Mathematics</topic><topic>Mathematics and Statistics</topic><topic>Mathematics, Applied</topic><topic>Noise control</topic><topic>Nonlinear control</topic><topic>Optimal Control</topic><topic>Optimization</topic><topic>Physical Sciences</topic><topic>Physics</topic><topic>Physics, Mathematical</topic><topic>Railroad cars</topic><topic>Science &amp; Technology</topic><topic>Tensors</topic><topic>Transition probabilities</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Boyko, A. I.</creatorcontrib><creatorcontrib>Oseledets, I. 
V.</creatorcontrib><creatorcontrib>Ferrer, G.</creatorcontrib><collection>Web of Science Core Collection</collection><collection>Science Citation Index Expanded</collection><collection>Web of Science - Science Citation Index Expanded - 2021</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Computational mathematics and mathematical physics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Boyko, A. I.</au><au>Oseledets, I. V.</au><au>Ferrer, G.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>TT-QI: Faster Value Iteration in Tensor Train Format for Stochastic Optimal Control</atitle><jtitle>Computational mathematics and mathematical physics</jtitle><stitle>Comput. Math. and Math. Phys</stitle><stitle>COMP MATH MATH PHYS</stitle><date>2021-05-01</date><risdate>2021</risdate><volume>61</volume><issue>5</issue><spage>836</spage><epage>846</epage><pages>836-846</pages><issn>0965-5425</issn><eissn>1555-6662</eissn><abstract>The problem of general non-linear stochastic optimal control with small Wiener noise is studied. The problem is approximated by a Markov Decision Process. Bellman Equation is solved using Value Iteration (VI) algorithm in the low rank Tensor Train format (TT-VI). 
In this paper a modification of the TT-VI algorithm called TT-Q-Iteration (TT-QI) is proposed by authors. In it, the nonlinear Bellman Optimality Operator is iteratively applied to the solution as a composition of internal Tensor Train algebraic operations and TT-CROSS algorithm. We show that it has lower asymptotic complexity per iteration than the method existing in the literature, provided that TT-ranks of transition probabilities are small. In test examples of an underpowered inverted pendulum and Dubins cars our method shows up to 3–10 times faster convergence in terms of wall clock time compared with the original method.</abstract><cop>Moscow</cop><pub>Pleiades Publishing</pub><doi>10.1134/S0965542521050043</doi><tpages>11</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0965-5425
ispartof Computational mathematics and mathematical physics, 2021-05, Vol.61 (5), p.836-846
issn 0965-5425
1555-6662
language eng
recordid cdi_proquest_journals_2547574742
source SpringerNature Journals; Web of Science - Science Citation Index Expanded - 2021
subjects Algorithms
Asymptotic methods
Computational Mathematics and Numerical Analysis
Format
Markov processes
Mathematical analysis
Mathematics
Mathematics and Statistics
Mathematics, Applied
Noise control
Nonlinear control
Optimal Control
Optimization
Physical Sciences
Physics
Physics, Mathematical
Railroad cars
Science & Technology
Tensors
Transition probabilities
title TT-QI: Faster Value Iteration in Tensor Train Format for Stochastic Optimal Control
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-04T04%3A02%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_webof&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TT-QI:%20Faster%20Value%20Iteration%20in%20Tensor%20Train%20Format%20for%20Stochastic%20Optimal%20Control&rft.jtitle=Computational%20mathematics%20and%20mathematical%20physics&rft.au=Boyko,%20A.%20I.&rft.date=2021-05-01&rft.volume=61&rft.issue=5&rft.spage=836&rft.epage=846&rft.pages=836-846&rft.issn=0965-5425&rft.eissn=1555-6662&rft_id=info:doi/10.1134/S0965542521050043&rft_dat=%3Cproquest_webof%3E2547574742%3C/proquest_webof%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2547574742&rft_id=info:pmid/&rfr_iscdi=true