Detecting and delaying effect of machine learning model attacks

Bibliographic details
Main authors: Kesarwani, Manish; Kumar, Atul; Pimplikar, Rakesh R; Arya, Vijay; Mehta, Sameep
Format: Patent
Language: English
Description
Summary: One embodiment provides a method for delaying malicious attacks on machine learning models that are trained using input captured from a plurality of users, including: deploying a model, said model designed to be used with an application, for responding to requests received from users, wherein the model comprises a machine learning model that has been previously trained using a data set; receiving input from one or more users; determining, using a malicious input detection technique, if the received input comprises malicious input; if the received input comprises malicious input, removing the malicious input from the input to be used to retrain the model; retraining the model using received input that is determined to not be malicious input; and providing, using the retrained model, a response to a received user query, the retrained model delaying the effect of malicious input on provided responses by removing malicious input from retraining input.
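
The steps in the summary (receive input, detect malicious input, remove it, retrain on the remainder, respond) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the record does not say which detection technique or model type the patent uses, so `MeanModel` is a toy stand-in for a trained model and `is_malicious` is a hypothetical z-score outlier rule, not the patented method.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import List


@dataclass
class MeanModel:
    """Toy stand-in (assumption) for a trained model: answers with the mean of its data."""
    data: List[float] = field(default_factory=list)

    def retrain(self, new_inputs: List[float]) -> None:
        # "Retraining" here just means absorbing the new, filtered inputs.
        self.data.extend(new_inputs)

    def respond(self) -> float:
        return mean(self.data)


def is_malicious(value: float, reference: List[float], z_max: float = 3.0) -> bool:
    """Hypothetical detector: flag inputs far from the existing training data."""
    mu, sigma = mean(reference), pstdev(reference)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_max


def filter_and_retrain(model: MeanModel, received: List[float]) -> List[float]:
    """Remove flagged inputs, retrain on the rest, and return what was removed."""
    reference = list(model.data)  # judge every input against the pre-update data
    clean = [x for x in received if not is_malicious(x, reference)]
    removed = [x for x in received if is_malicious(x, reference)]
    model.retrain(clean)
    return removed


# Deployed model previously trained on a small data set.
model = MeanModel(data=[9.0, 10.0, 11.0, 10.0])
# New user input, one value of which is an obvious poisoning attempt.
removed = filter_and_retrain(model, [10.5, 9.5, 1000.0])
print(removed, model.respond())
```

In this sketch the outlier never reaches the retraining set, so the model's response stays close to what the original data supports. A real deployment would substitute an actual detection technique and model.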