ENVIRONMENT CONTROLLER AND METHOD FOR GENERATING A PREDICTIVE MODEL OF A NEURAL NETWORK THROUGH DISTRIBUTED REINFORCEMENT LEARNING
Interactions between a training server and a plurality of environment controllers are used for updating the weights of a predictive model used by a neural network executed by the plurality of environment controllers. Each environment controller executes the neural network using a current version of...
Main authors: | Gervais, Francois ; Lupien, Steve |
---|---|
Format: | Patent |
Language: | eng ; fre ; ger |
Subjects: | |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Gervais, Francois ; Lupien, Steve |
description | Interactions between a training server and a plurality of environment controllers are used for updating the weights of a predictive model used by a neural network executed by the plurality of environment controllers. Each environment controller executes the neural network using a current version of the predictive model to generate outputs based on inputs, modifies the outputs, and generates metrics representative of the effectiveness of the modified outputs for controlling the environment. The training server collects the inputs, the corresponding modified outputs, and the corresponding metrics from the plurality of environment controllers. The collected inputs, modified outputs and metrics are used by the training server for updating the weights of the current predictive model through reinforcement learning. A new predictive model comprising the updated weights is transmitted to the environment controllers to be used in place of the current predictive model. |
format | Patent |
fullrecord | <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_EP3786732A1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>EP3786732A1</sourcerecordid><originalsourceid>FETCH-epo_espacenet_EP3786732A13</originalsourceid><addsrcrecordid>eNqNjTEKwkAQRdNYiHqHuYCFBoztmp1kFzczMk5iGYKslWggnsCTu4gHsHrw-Lw_z95InRemBkmhZFLhEFDAkIUG1bGFigVqJBSjnmowcBK0vlTfITRsMQBXyRK2YkKCXliOoE64rR1Yf1bxh1bRgqCnVCvx-xbQCKXiMpvdhvsUVz8uMqhQS7eO47OP0zhc4yO-ejzlxX5X5Fuzyf-YfACjFDy-</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>ENVIRONMENT CONTROLLER AND METHOD FOR GENERATING A PREDICTIVE MODEL OF A NEURAL NETWORK THROUGH DISTRIBUTED REINFORCEMENT LEARNING</title><source>esp@cenet</source><creator>Gervais, Francois ; Lupien, Steve</creator><creatorcontrib>Gervais, Francois ; Lupien, Steve</creatorcontrib><description>Interactions between a training server and a plurality of environment controllers are used for updating the weights of a predictive model used by a neural network executed by the plurality of environment controllers. Each environment controller executes the neural network using a current version of the predictive model to generate outputs based on inputs, modifies the outputs, and generates metrics representative of the effectiveness of the modified outputs for controlling the environment. The training server collects the inputs, the corresponding modified outputs, and the corresponding metrics from the plurality of environment controllers. The collected inputs, modified outputs and metrics are used by the training server for updating the weights of the current predictive model through reinforcement learning. 
A new predictive model comprising the updated weights is transmitted to the environment controllers to be used in place of the current predictive model.</description><language>eng ; fre ; ger</language><subject>CONTROL OR REGULATING SYSTEMS IN GENERAL ; CONTROLLING ; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS ; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS ORELEMENTS ; PHYSICS ; REGULATING</subject><creationdate>2021</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210303&DB=EPODOC&CC=EP&NR=3786732A1$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,776,881,25542,76290</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210303&DB=EPODOC&CC=EP&NR=3786732A1$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Gervais, Francois</creatorcontrib><creatorcontrib>Lupien, Steve</creatorcontrib><title>ENVIRONMENT CONTROLLER AND METHOD FOR GENERATING A PREDICTIVE MODEL OF A NEURAL NETWORK THROUGH DISTRIBUTED REINFORCEMENT LEARNING</title><description>Interactions between a training server and a plurality of environment controllers are used for updating the weights of a predictive model used by a neural network executed by the plurality of environment controllers. Each environment controller executes the neural network using a current version of the predictive model to generate outputs based on inputs, modifies the outputs, and generates metrics representative of the effectiveness of the modified outputs for controlling the environment. 
The training server collects the inputs, the corresponding modified outputs, and the corresponding metrics from the plurality of environment controllers. The collected inputs, modified outputs and metrics are used by the training server for updating the weights of the current predictive model through reinforcement learning. A new predictive model comprising the updated weights is transmitted to the environment controllers to be used in place of the current predictive model.</description><subject>CONTROL OR REGULATING SYSTEMS IN GENERAL</subject><subject>CONTROLLING</subject><subject>FUNCTIONAL ELEMENTS OF SUCH SYSTEMS</subject><subject>MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS ORELEMENTS</subject><subject>PHYSICS</subject><subject>REGULATING</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2021</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNqNjTEKwkAQRdNYiHqHuYCFBoztmp1kFzczMk5iGYKslWggnsCTu4gHsHrw-Lw_z95InRemBkmhZFLhEFDAkIUG1bGFigVqJBSjnmowcBK0vlTfITRsMQBXyRK2YkKCXliOoE64rR1Yf1bxh1bRgqCnVCvx-xbQCKXiMpvdhvsUVz8uMqhQS7eO47OP0zhc4yO-ejzlxX5X5Fuzyf-YfACjFDy-</recordid><startdate>20210303</startdate><enddate>20210303</enddate><creator>Gervais, Francois</creator><creator>Lupien, Steve</creator><scope>EVB</scope></search><sort><creationdate>20210303</creationdate><title>ENVIRONMENT CONTROLLER AND METHOD FOR GENERATING A PREDICTIVE MODEL OF A NEURAL NETWORK THROUGH DISTRIBUTED REINFORCEMENT LEARNING</title><author>Gervais, Francois ; Lupien, Steve</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_EP3786732A13</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng ; fre ; ger</language><creationdate>2021</creationdate><topic>CONTROL OR REGULATING SYSTEMS IN GENERAL</topic><topic>CONTROLLING</topic><topic>FUNCTIONAL ELEMENTS OF SUCH SYSTEMS</topic><topic>MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS 
ORELEMENTS</topic><topic>PHYSICS</topic><topic>REGULATING</topic><toplevel>online_resources</toplevel><creatorcontrib>Gervais, Francois</creatorcontrib><creatorcontrib>Lupien, Steve</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Gervais, Francois</au><au>Lupien, Steve</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>ENVIRONMENT CONTROLLER AND METHOD FOR GENERATING A PREDICTIVE MODEL OF A NEURAL NETWORK THROUGH DISTRIBUTED REINFORCEMENT LEARNING</title><date>2021-03-03</date><risdate>2021</risdate><abstract>Interactions between a training server and a plurality of environment controllers are used for updating the weights of a predictive model used by a neural network executed by the plurality of environment controllers. Each environment controller executes the neural network using a current version of the predictive model to generate outputs based on inputs, modifies the outputs, and generates metrics representative of the effectiveness of the modified outputs for controlling the environment. The training server collects the inputs, the corresponding modified outputs, and the corresponding metrics from the plurality of environment controllers. The collected inputs, modified outputs and metrics are used by the training server for updating the weights of the current predictive model through reinforcement learning. A new predictive model comprising the updated weights is transmitted to the environment controllers to be used in place of the current predictive model.</abstract><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | eng ; fre ; ger |
recordid | cdi_epo_espacenet_EP3786732A1 |
source | esp@cenet |
subjects | CONTROL OR REGULATING SYSTEMS IN GENERAL ; CONTROLLING ; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS ; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS ; PHYSICS ; REGULATING |
title | ENVIRONMENT CONTROLLER AND METHOD FOR GENERATING A PREDICTIVE MODEL OF A NEURAL NETWORK THROUGH DISTRIBUTED REINFORCEMENT LEARNING |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T04%3A08%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Gervais,%20Francois&rft.date=2021-03-03&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EEP3786732A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
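
The loop summarized in the description field (controllers run the current model, modify its outputs, score them, and report to a training server that returns a new model) can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the abstract does not say what the network looks like, how outputs are modified, how the metric is computed, or which reinforcement learning update is used. Here the network is replaced by a single linear unit, the modification is assumed to be Gaussian exploration noise, the metric is an assumed negative squared error against a hypothetical ideal command, and the update is a REINFORCE-style policy-gradient step with a mean baseline. All class and function names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Experience:
    inputs: list[float]
    modified_output: float
    metric: float  # assumed convention: higher means more effective control


@dataclass
class PredictiveModel:
    version: int
    weights: list[float]

    def predict(self, inputs):
        # Stand-in for the neural network: one linear unit (assumption).
        return sum(w * x for w, x in zip(self.weights, inputs))


class EnvironmentController:
    def __init__(self, model, target_weights, rng):
        self.model = model
        self._target = target_weights  # hypothetical ideal behaviour, used only for scoring
        self._rng = rng

    def step(self, inputs):
        output = self.model.predict(inputs)
        # Assumed form of "modifying the outputs": additive exploration noise.
        modified = output + self._rng.gauss(0.0, 0.5)
        ideal = sum(w * x for w, x in zip(self._target, inputs))
        # Assumed metric: negative squared distance to the ideal command.
        metric = -((modified - ideal) ** 2)
        return Experience(inputs, modified, metric)


class TrainingServer:
    def __init__(self, model, learning_rate=0.05):
        self.model = model
        self.learning_rate = learning_rate

    def update(self, batch):
        # REINFORCE-style step (assumption): move the weights toward modified
        # outputs that scored above the batch average, away from those below.
        baseline = sum(e.metric for e in batch) / len(batch)
        weights = list(self.model.weights)
        for e in batch:
            advantage = e.metric - baseline
            deviation = e.modified_output - self.model.predict(e.inputs)
            for i, x in enumerate(e.inputs):
                weights[i] += self.learning_rate * advantage * deviation * x / len(batch)
        # The new model replaces the current one and gets a new version number.
        self.model = PredictiveModel(self.model.version + 1, weights)
        return self.model


def run(rounds=200, n_controllers=4, seed=0):
    rng = random.Random(seed)
    target = [0.8, -0.3]
    server = TrainingServer(PredictiveModel(0, [0.0, 0.0]))
    controllers = [EnvironmentController(server.model, target, rng)
                   for _ in range(n_controllers)]
    for _ in range(rounds):
        # The server collects inputs, modified outputs and metrics from every controller.
        batch = [c.step([rng.uniform(-1, 1), rng.uniform(-1, 1)])
                 for c in controllers for _ in range(8)]
        new_model = server.update(batch)
        # The new model is transmitted to all controllers.
        for c in controllers:
            c.model = new_model
    return server.model


if __name__ == "__main__":
    final = run()
    print(final.version, final.weights)
```

In this sketch the controllers never train locally; they only report experience, and every controller switches to the same new model version after each server update, which mirrors the division of roles stated in the abstract.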