Multi-objectivization and ensembles of shapings in reinforcement learning

Ensemble techniques are a powerful approach to creating better decision makers in machine learning. Multiple decision makers are trained to solve a given task, grouped in an ensemble, and their decisions are aggregated. The ensemble derives its power from the diversity of its components, as the assumption is that they make mistakes on different inputs, and that the majority is more likely to be correct than any individual component. Diversity usually comes from the different algorithms employed by the decision makers, or the different inputs used to train the decision makers. We advocate a third way to achieve this diversity, called diversity of evaluation, using the principle of multi-objectivization. This is the process of taking a single-objective problem and transforming it into a multi-objective problem in order to solve the original problem faster and/or better. This is either done through decomposition of the original objective, or the addition of extra objectives, typically based on some (heuristic) domain knowledge. This process basically creates a diverse set of feedback signals for what is underneath still a single-objective problem. In the context of ensemble techniques, these various ways to evaluate a (solution to a) problem allow different components of the ensemble to look at the problem in different ways, generating the necessary diversity for the ensemble. In this paper, we argue for the combination of multi-objectivization and ensemble techniques as a powerful tool to boost solving performance in reinforcement learning. We inject various pieces of heuristic information through reward shaping, creating several distinct enriched reward signals, which can strategically be combined using ensemble techniques to reduce sample complexity. We provide theoretical guarantees and demonstrate the potential of the approach with a range of experiments.
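The idea described in the abstract can be sketched in a few lines; the following is an illustrative toy implementation, not the authors' code. The chain MDP, the two heuristic potentials, and all names are assumptions made for the example: each ensemble member is a Q-learner trained on the same base reward plus a different potential-based shaping term F(s, s') = γ·φ(s') − φ(s), and the members are combined by summing their Q-values.

```python
import random

# Toy sketch (assumed setup, not the paper's experiments): a 1-D chain MDP
# where reaching the last state yields reward 1, learned by an ensemble of
# Q-learners, each shaped with a different heuristic potential.

N = 5            # states 0..4; reaching state 4 yields reward 1 and ends the episode
GAMMA = 0.95
ACTIONS = (-1, +1)

def step(s, a):
    s2 = min(max(s + a, 0), N - 1)
    return s2, (1.0 if s2 == N - 1 else 0.0), s2 == N - 1

# Two heuristic potentials: each encodes "progress toward the goal" differently,
# giving the ensemble members diverse evaluations of the same task.
potentials = [
    lambda s: s / (N - 1),        # normalized progress
    lambda s: -(N - 1 - s) / N,   # negative distance to the goal
]

def train(phi, episodes=300, alpha=0.3, eps=0.2, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N)]
    for _ in range(episodes):
        s, done, t = 0, False, 0
        while not done and t < 50:
            if rng.random() < eps:
                ai = rng.randrange(2)
            else:  # greedy with random tie-breaking
                m = max(Q[s])
                ai = rng.choice([i for i in (0, 1) if Q[s][i] == m])
            s2, r, done = step(s, ACTIONS[ai])
            # potential-based shaping F(s, s') leaves the optimal policy unchanged
            shaped = r + GAMMA * phi(s2) - phi(s)
            target = shaped + (0.0 if done else GAMMA * max(Q[s2]))
            Q[s][ai] += alpha * (target - Q[s][ai])
            s, t = s2, t + 1
    return Q

ensemble = [train(phi, seed=i) for i, phi in enumerate(potentials)]

def ensemble_action(s):
    # combine members by summing Q-values (one simple aggregation rule)
    return max((0, 1), key=lambda i: sum(Q[s][i] for Q in ensemble))

print([ACTIONS[ensemble_action(s)] for s in range(N - 1)])
```

With both shapings encouraging progress toward the goal, the combined greedy policy moves right from every non-goal state; other aggregation rules from the ensemble literature (majority voting, rank voting) would slot in at `ensemble_action`.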

Bibliographic Details
Published in: Neurocomputing (Amsterdam) 2017-11, Vol.263, p.48-59
Main authors: Brys, Tim, Harutyunyan, Anna, Vrancx, Peter, Nowé, Ann, Taylor, Matthew E.
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 59
container_issue
container_start_page 48
container_title Neurocomputing (Amsterdam)
container_volume 263
creator Brys, Tim
Harutyunyan, Anna
Vrancx, Peter
Nowé, Ann
Taylor, Matthew E.
description Ensemble techniques are a powerful approach to creating better decision makers in machine learning. Multiple decision makers are trained to solve a given task, grouped in an ensemble, and their decisions are aggregated. The ensemble derives its power from the diversity of its components, as the assumption is that they make mistakes on different inputs, and that the majority is more likely to be correct than any individual component. Diversity usually comes from the different algorithms employed by the decision makers, or the different inputs used to train the decision makers. We advocate a third way to achieve this diversity, called diversity of evaluation, using the principle of multi-objectivization. This is the process of taking a single-objective problem and transforming it into a multi-objective problem in order to solve the original problem faster and/or better. This is either done through decomposition of the original objective, or the addition of extra objectives, typically based on some (heuristic) domain knowledge. This process basically creates a diverse set of feedback signals for what is underneath still a single-objective problem. In the context of ensemble techniques, these various ways to evaluate a (solution to a) problem allow different components of the ensemble to look at the problem in different ways, generating the necessary diversity for the ensemble. In this paper, we argue for the combination of multi-objectivization and ensemble techniques as a powerful tool to boost solving performance in reinforcement learning. We inject various pieces of heuristic information through reward shaping, creating several distinct enriched reward signals, which can strategically be combined using ensemble techniques to reduce sample complexity. We provide theoretical guarantees and demonstrate the potential of the approach with a range of experiments.
doi_str_mv 10.1016/j.neucom.2017.02.096
format Article
fulltext fulltext
identifier ISSN: 0925-2312
ispartof Neurocomputing (Amsterdam), 2017-11, Vol.263, p.48-59
issn 0925-2312
1872-8286
language eng
recordid cdi_proquest_miscellaneous_2675559288
source Elsevier ScienceDirect Journals
subjects algorithms
artificial intelligence
decision making
Ensemble techniques
Multi-objectivization
objectives
problem solving
Reinforcement learning
Reward shaping
sampling
shape
solutions
title Multi-objectivization and ensembles of shapings in reinforcement learning
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T16%3A07%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multi-objectivization%20and%20ensembles%20of%20shapings%20in%20reinforcement%20learning&rft.jtitle=Neurocomputing%20(Amsterdam)&rft.au=Brys,%20Tim&rft.date=2017-11-08&rft.volume=263&rft.spage=48&rft.epage=59&rft.pages=48-59&rft.issn=0925-2312&rft.eissn=1872-8286&rft_id=info:doi/10.1016/j.neucom.2017.02.096&rft_dat=%3Cproquest_cross%3E2675559288%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2675559288&rft_id=info:pmid/&rft_els_id=S0925231217310962&rfr_iscdi=true