Learning with moment estimation using different time constants

A technique for training a model includes obtaining a training example for a model having model parameters stored on one or more computer readable storage mediums operably coupled to the hardware processor. The training example includes an outcome and features to explain the outcome. A gradient is c...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
1. Verfasser: MORIMURA, Tetsuro
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:A technique for training a model includes obtaining a training example for a model having model parameters stored on one or more computer readable storage mediums operably coupled to the hardware processor. The training example includes an outcome and features to explain the outcome. A gradient is calculated with respect to the model parameters of the model using the training example. Two estimates of a moment of the gradient with two different time constants are computed for the same type of the moment using the gradient. Using a hardware processor, the model parameters of the model are updated using the two estimates of the moment with the two different time constants to reduce errors while calculating the at least two estimates of the moment of the gradient.