Human-level performance in 3D multiplayer games with population-based reinforcement learning

Reinforcement learning (RL) has shown great success in increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents. We used a tournament-style evalu...

Bibliographic details
Published in:	Science (American Association for the Advancement of Science) 2019-05, Vol.364 (6443), p.859-865
Main authors:	Jaderberg, Max; Czarnecki, Wojciech M.; Dunning, Iain; Marris, Luke; Lever, Guy; Castañeda, Antonio Garcia; Beattie, Charles; Rabinowitz, Neil C.; Morcos, Ari S.; Ruderman, Avraham; Sonnerat, Nicolas; Green, Tim; Deason, Louise; Leibo, Joel Z.; Silver, David; Hassabis, Demis; Kavukcuoglu, Koray; Graepel, Thore
Format:	Article
Language:	eng
Online access:	Full text

container_end_page	865
container_issue	6443
container_start_page	859
container_title	Science (American Association for the Advancement of Science)
container_volume	364
creator	Jaderberg, Max ; Czarnecki, Wojciech M. ; Dunning, Iain ; Marris, Luke ; Lever, Guy ; Castañeda, Antonio Garcia ; Beattie, Charles ; Rabinowitz, Neil C. ; Morcos, Ari S. ; Ruderman, Avraham ; Sonnerat, Nicolas ; Green, Tim ; Deason, Louise ; Leibo, Joel Z. ; Silver, David ; Hassabis, Demis ; Kavukcuoglu, Koray ; Graepel, Thore
description	Reinforcement learning (RL) has shown great success in increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents. We used a tournament-style evaluation to demonstrate that an agent can achieve human-level performance in a three-dimensional multiplayer first-person video game, Quake III Arena in Capture the Flag mode, using only pixels and game points scored as input. We used a two-tier optimization process in which a population of independent RL agents are trained concurrently from thousands of parallel matches on randomly generated environments. Each agent learns its own internal reward signal and rich representation of the world. These results indicate the great potential of multiagent reinforcement learning for artificial intelligence research.
format	Article
fullrecord	<record><control><sourceid>jstor</sourceid><recordid>TN_cdi_jstor_primary_26681482</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>26681482</jstor_id><sourcerecordid>26681482</sourcerecordid><originalsourceid>FETCH-jstor_primary_266814823</originalsourceid><addsrcrecordid>eNqFi0sKwjAUAIMoWD9HEN4FAmlja7v2Qw_gzoXE-lpT8iNJld7eLty7GoZhZiRJWZXTKmN8ThLGeEFLdsiXZBVCz9jUKp6QWz1oYajCNypw6FvrJ28QpAF-Aj2oKJ0SI3rohMYAHxlf4KwblIjSmocI-ASP0kxngxpNBIXCG2m6DVm0QgXc_rgmu8v5eqxpH6L1d-elFn68Z0VRpvsy4__6F64DQPY</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Human-level performance in 3D multiplayer games with populationbased reinforcement learning</title><source>American Association for the Advancement of Science</source><creator>Jaderberg, Max ; Czarnecki, Wojciech M. ; Dunning, Iain ; Marris, Luke ; Lever, Guy ; Castañeda, Antonio Garcia ; Beattie, Charles ; Rabinowitz, Neil C. ; Morcos, Ari S. ; Ruderman, Avraham ; Sonnerat, Nicolas ; Green, Tim ; Deason, Louise ; Leibo, Joel Z. ; Silver, David ; Hassabis, Demis ; Kavukcuoglu, Koray ; Graepel, Thore</creator><creatorcontrib>Jaderberg, Max ; Czarnecki, Wojciech M. ; Dunning, Iain ; Marris, Luke ; Lever, Guy ; Castañeda, Antonio Garcia ; Beattie, Charles ; Rabinowitz, Neil C. ; Morcos, Ari S. ; Ruderman, Avraham ; Sonnerat, Nicolas ; Green, Tim ; Deason, Louise ; Leibo, Joel Z. ; Silver, David ; Hassabis, Demis ; Kavukcuoglu, Koray ; Graepel, Thore</creatorcontrib><description>Reinforcement learning (RL) has shown great success in increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents. 
We used a tournament-style evaluation to demonstrate that an agent can achieve human-level performance in a three-dimensional multiplayer first-person video game, Quake III Arena in Capture the Flag mode, using only pixels and game points scored as input.We used a two-tier optimization process in which a population of independent RL agents are trained concurrently from thousands of parallel matches on randomly generated environments. Each agent learns its own internal reward signal and rich representation of the world. These results indicate the great potential of multiagent reinforcement learning for artificial intelligence research.</description><identifier>ISSN: 0036-8075</identifier><identifier>EISSN: 1095-9203</identifier><language>eng</language><publisher>American Association for the Advancement of Science</publisher><ispartof>Science (American Association for the Advancement of Science), 2019-05, Vol.364 (6443), p.859-865</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780</link.rule.ids></links><search><creatorcontrib>Jaderberg, Max</creatorcontrib><creatorcontrib>Czarnecki, Wojciech M.</creatorcontrib><creatorcontrib>Dunning, Iain</creatorcontrib><creatorcontrib>Marris, Luke</creatorcontrib><creatorcontrib>Lever, Guy</creatorcontrib><creatorcontrib>Castañeda, Antonio Garcia</creatorcontrib><creatorcontrib>Beattie, Charles</creatorcontrib><creatorcontrib>Rabinowitz, Neil C.</creatorcontrib><creatorcontrib>Morcos, Ari S.</creatorcontrib><creatorcontrib>Ruderman, Avraham</creatorcontrib><creatorcontrib>Sonnerat, Nicolas</creatorcontrib><creatorcontrib>Green, Tim</creatorcontrib><creatorcontrib>Deason, Louise</creatorcontrib><creatorcontrib>Leibo, Joel Z.</creatorcontrib><creatorcontrib>Silver, 
David</creatorcontrib><creatorcontrib>Hassabis, Demis</creatorcontrib><creatorcontrib>Kavukcuoglu, Koray</creatorcontrib><creatorcontrib>Graepel, Thore</creatorcontrib><title>Human-level performance in 3D multiplayer games with populationbased reinforcement learning</title><title>Science (American Association for the Advancement of Science)</title><description>Reinforcement learning (RL) has shown great success in increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents. We used a tournament-style evaluation to demonstrate that an agent can achieve human-level performance in a three-dimensional multiplayer first-person video game, Quake III Arena in Capture the Flag mode, using only pixels and game points scored as input.We used a two-tier optimization process in which a population of independent RL agents are trained concurrently from thousands of parallel matches on randomly generated environments. Each agent learns its own internal reward signal and rich representation of the world. 
These results indicate the great potential of multiagent reinforcement learning for artificial intelligence research.</description><issn>0036-8075</issn><issn>1095-9203</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid/><recordid>eNqFi0sKwjAUAIMoWD9HEN4FAmlja7v2Qw_gzoXE-lpT8iNJld7eLty7GoZhZiRJWZXTKmN8ThLGeEFLdsiXZBVCz9jUKp6QWz1oYajCNypw6FvrJ28QpAF-Aj2oKJ0SI3rohMYAHxlf4KwblIjSmocI-ASP0kxngxpNBIXCG2m6DVm0QgXc_rgmu8v5eqxpH6L1d-elFn68Z0VRpvsy4__6F64DQPY</recordid><startdate>20190531</startdate><enddate>20190531</enddate><creator>Jaderberg, Max</creator><creator>Czarnecki, Wojciech M.</creator><creator>Dunning, Iain</creator><creator>Marris, Luke</creator><creator>Lever, Guy</creator><creator>Castañeda, Antonio Garcia</creator><creator>Beattie, Charles</creator><creator>Rabinowitz, Neil C.</creator><creator>Morcos, Ari S.</creator><creator>Ruderman, Avraham</creator><creator>Sonnerat, Nicolas</creator><creator>Green, Tim</creator><creator>Deason, Louise</creator><creator>Leibo, Joel Z.</creator><creator>Silver, David</creator><creator>Hassabis, Demis</creator><creator>Kavukcuoglu, Koray</creator><creator>Graepel, Thore</creator><general>American Association for the Advancement of Science</general><scope/></search><sort><creationdate>20190531</creationdate><title>Human-level performance in 3D multiplayer games with populationbased reinforcement learning</title><author>Jaderberg, Max ; Czarnecki, Wojciech M. ; Dunning, Iain ; Marris, Luke ; Lever, Guy ; Castañeda, Antonio Garcia ; Beattie, Charles ; Rabinowitz, Neil C. ; Morcos, Ari S. ; Ruderman, Avraham ; Sonnerat, Nicolas ; Green, Tim ; Deason, Louise ; Leibo, Joel Z. 
; Silver, David ; Hassabis, Demis ; Kavukcuoglu, Koray ; Graepel, Thore</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-jstor_primary_266814823</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jaderberg, Max</creatorcontrib><creatorcontrib>Czarnecki, Wojciech M.</creatorcontrib><creatorcontrib>Dunning, Iain</creatorcontrib><creatorcontrib>Marris, Luke</creatorcontrib><creatorcontrib>Lever, Guy</creatorcontrib><creatorcontrib>Castañeda, Antonio Garcia</creatorcontrib><creatorcontrib>Beattie, Charles</creatorcontrib><creatorcontrib>Rabinowitz, Neil C.</creatorcontrib><creatorcontrib>Morcos, Ari S.</creatorcontrib><creatorcontrib>Ruderman, Avraham</creatorcontrib><creatorcontrib>Sonnerat, Nicolas</creatorcontrib><creatorcontrib>Green, Tim</creatorcontrib><creatorcontrib>Deason, Louise</creatorcontrib><creatorcontrib>Leibo, Joel Z.</creatorcontrib><creatorcontrib>Silver, David</creatorcontrib><creatorcontrib>Hassabis, Demis</creatorcontrib><creatorcontrib>Kavukcuoglu, Koray</creatorcontrib><creatorcontrib>Graepel, Thore</creatorcontrib><jtitle>Science (American Association for the Advancement of Science)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Jaderberg, Max</au><au>Czarnecki, Wojciech M.</au><au>Dunning, Iain</au><au>Marris, Luke</au><au>Lever, Guy</au><au>Castañeda, Antonio Garcia</au><au>Beattie, Charles</au><au>Rabinowitz, Neil C.</au><au>Morcos, Ari S.</au><au>Ruderman, Avraham</au><au>Sonnerat, Nicolas</au><au>Green, Tim</au><au>Deason, Louise</au><au>Leibo, Joel Z.</au><au>Silver, David</au><au>Hassabis, Demis</au><au>Kavukcuoglu, Koray</au><au>Graepel, Thore</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Human-level performance in 3D multiplayer games with 
populationbased reinforcement learning</atitle><jtitle>Science (American Association for the Advancement of Science)</jtitle><date>2019-05-31</date><risdate>2019</risdate><volume>364</volume><issue>6443</issue><spage>859</spage><epage>865</epage><pages>859-865</pages><issn>0036-8075</issn><eissn>1095-9203</eissn><abstract>Reinforcement learning (RL) has shown great success in increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents. We used a tournament-style evaluation to demonstrate that an agent can achieve human-level performance in a three-dimensional multiplayer first-person video game, Quake III Arena in Capture the Flag mode, using only pixels and game points scored as input.We used a two-tier optimization process in which a population of independent RL agents are trained concurrently from thousands of parallel matches on randomly generated environments. Each agent learns its own internal reward signal and rich representation of the world. These results indicate the great potential of multiagent reinforcement learning for artificial intelligence research.</abstract><pub>American Association for the Advancement of Science</pub></addata></record>
fulltext	fulltext
identifier	ISSN: 0036-8075
ispartof	Science (American Association for the Advancement of Science), 2019-05, Vol.364 (6443), p.859-865
issn	0036-8075 1095-9203
language	eng
recordid	cdi_jstor_primary_26681482
source	American Association for the Advancement of Science
title	Human-level performance in 3D multiplayer games with population-based reinforcement learning
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T15%3A18%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Human-level%20performance%20in%203D%20multiplayer%20games%20with%20populationbased%20reinforcement%20learning&rft.jtitle=Science%20(American%20Association%20for%20the%20Advancement%20of%20Science)&rft.au=Jaderberg,%20Max&rft.date=2019-05-31&rft.volume=364&rft.issue=6443&rft.spage=859&rft.epage=865&rft.pages=859-865&rft.issn=0036-8075&rft.eissn=1095-9203&rft_id=info:doi/&rft_dat=%3Cjstor%3E26681482%3C/jstor%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_jstor_id=26681482&rfr_iscdi=true