Phased deployment of deep-learning models to customer-facing APIs


Bibliographic details
Main authors: Bodapati, Sravan Babu; Leen, David
Format: Patent
Language: English
Description
Summary: Techniques for phased deployment of machine learning models are described. Customers can call a training API to initiate model training, but must then wait for training to complete before the model can be used to perform inference. Depending on the type of model, the machine learning algorithm used for training, the size of the training dataset, etc., this training process may take hours or days to complete. This leads to significant downtime during which inference requests cannot be served. Embodiments improve upon existing systems by providing phased deployment of custom models. For example, a simple, less accurate model can be provided synchronously in response to a request for a custom model. At the same time, one or more machine learning models can be trained asynchronously in the background. When a background-trained machine learning model is ready for use, the customers' traffic and jobs can be transferred over to that better model.
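
The phased flow in the summary can be illustrated with a minimal sketch. It is not the patented implementation: the class `PhasedEndpoint`, the functions `baseline_model` and `train_full_model`, and the string-in, string-out model signature are all hypothetical names chosen for illustration. The sketch assumes one process in which a baseline model answers requests immediately while a background thread trains a better model and then swaps it in.

```python
import threading
from typing import Callable

# Hypothetical model type: any callable mapping an input string to an output string.
Model = Callable[[str], str]


class PhasedEndpoint:
    """Serves a quick baseline model until a background-trained model is ready."""

    def __init__(self, baseline: Model, train_full: Callable[[], Model]) -> None:
        self._active: Model = baseline   # phase 1: available synchronously
        self._lock = threading.Lock()    # guards the model swap
        self.ready = threading.Event()   # set once the better model is live
        # Phase 2: train the better model asynchronously in the background.
        worker = threading.Thread(target=self._train, args=(train_full,), daemon=True)
        worker.start()

    def _train(self, train_full: Callable[[], Model]) -> None:
        better = train_full()            # may take hours or days in practice
        with self._lock:
            self._active = better        # transfer traffic to the better model
        self.ready.set()

    def predict(self, text: str) -> str:
        with self._lock:
            model = self._active         # whichever model is live right now
        return model(text)


def baseline_model(text: str) -> str:
    # Stand-in for a simple, less accurate model.
    return "baseline:" + text


training_done = threading.Event()        # stands in for a long training run


def train_full_model() -> Model:
    training_done.wait()                 # block until "training" completes
    return lambda text: "full:" + text


endpoint = PhasedEndpoint(baseline_model, train_full_model)
first = endpoint.predict("hello")        # served by the baseline model
training_done.set()                      # simulate training finishing
endpoint.ready.wait(timeout=5)           # wait for the swap to happen
second = endpoint.predict("hello")       # served by the better model
```

In this sketch, callers of `predict` never wait on training: `first` comes from the baseline model and `second` from the trained one. A production system would more likely move traffic gradually between separately hosted models than swap an in-process reference, and would need to handle training failures; both are outside the scope of this illustration.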