Pseudo-model-free hedging for variable annuities via deep reinforcement learning

This paper proposes a two-phase deep reinforcement learning approach for hedging variable annuity contracts with both guaranteed minimum maturity benefit (GMMB) and guaranteed minimum death benefit (GMDB) riders, which addresses model miscalibration in a Black-Scholes financial market and a constant-force-of-mortality actuarial market environment. In the training phase, an infant (untrained) reinforcement learning agent interacts with a pre-designed training environment, collects sequential anchor-hedging reward signals, and gradually learns how to hedge the contracts. As expected, after a sufficient number of training steps, the trained agent hedges in the training environment as well as the correct Delta does, while outperforming misspecified Deltas. In the online learning phase, the trained agent interacts with the market environment in real time, collects single terminal reward signals, and self-revises its hedging strategy. The hedging performance of the further trained agent is demonstrated via an illustrative example on a rolling basis, revealing the self-revision capability of the hedging strategy under online learning.
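For context, the "correct Delta" benchmark mentioned above has a closed form in this market. The following minimal Python sketch (not from the paper; parameter values are hypothetical, and rider charges and fees are ignored) computes the Black-Scholes Delta of a GMMB liability under a constant force of mortality, alongside a misspecified Delta that uses the wrong volatility:

```python
# Illustrative sketch only, not the authors' code: Black-Scholes Delta
# of a GMMB liability max(G - F_T, 0), paid at maturity if the
# policyholder survives, under a constant force of mortality mu.
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal CDF

def gmmb_delta(F, G, r, sigma, mu, tau):
    """Delta w.r.t. the fund value F, with time to maturity tau.
    Survival under a constant force of mortality discounts the
    Black-Scholes put Delta by exp(-mu * tau)."""
    if tau <= 0:
        return 0.0
    d1 = (log(F / G) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    put_delta = N(d1) - 1.0  # Black-Scholes put Delta, in (-1, 0)
    return exp(-mu * tau) * put_delta

# A misspecified Delta uses a miscalibrated volatility; the paper's RL
# agent is trained to match the correct Delta and beat such hedges.
correct = gmmb_delta(F=100.0, G=100.0, r=0.02, sigma=0.2, mu=0.01, tau=5.0)
misspecified = gmmb_delta(F=100.0, G=100.0, r=0.02, sigma=0.3, mu=0.01, tau=5.0)
print(correct, misspecified)
```

The gap between the two printed hedge ratios is the model-miscalibration risk that the two-phase reinforcement learning approach is designed to mitigate without assuming knowledge of the true model parameters.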

Bibliographic details
Published in: Annals of Actuarial Science, 2023-11, Vol. 17 (3), pp. 503-546
Authors: Chong, Wing Fung; Cui, Haoen; Li, Yuxuan
Format: Article
Language: English
DOI: 10.1017/S1748499523000027
ISSN: 1748-4995
EISSN: 1748-5002
Publisher: Cambridge University Press, Cambridge, UK
Source: Cambridge University Press Journals Complete
Rights: Open Access; © The Author(s), 2023, published by Cambridge University Press on behalf of the Institute and Faculty of Actuaries under a Creative Commons Attribution (CC BY 4.0) licence
Subjects: Actuarial science; Distance learning; Hedging; Insurance premiums; Machine learning; Mortality; Neural networks; Original Research Paper; Profits; Securities markets; Valuation; Variable annuities
Online access: Full text