Detecting and delaying effect of machine learning model attacks
Format: | Patent |
---|---|
Language: | eng |
Summary: | One embodiment provides a method for delaying malicious attacks on machine learning models that are trained using input captured from a plurality of users, including: deploying a model, said model designed to be used with an application, for responding to requests received from users, wherein the model comprises a machine learning model that has been previously trained using a data set; receiving input from one or more users; determining, using a malicious input detection technique, if the received input comprises malicious input; if the received input comprises malicious input, removing the malicious input from the input to be used to retrain the model; retraining the model using received input that is determined to not be malicious input; and providing, using the retrained model, a response to a received user query, the retrained model delaying the effect of malicious input on provided responses by removing malicious input from retraining input. |
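
The steps named in the summary (receive input, screen it, drop what is flagged, retrain on the rest, then answer queries) can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in rather than the patent's method: `ToyModel` is a toy that only remembers the words it was trained on, `BLOCKLIST` and `looks_malicious` are an assumed keyword check standing in for whatever malicious input detection technique an embodiment would use, and `retrain_cycle` is an illustrative name for one screening-and-retraining pass.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set

# Assumption: a keyword blocklist stands in for the unspecified
# "malicious input detection technique" mentioned in the summary.
BLOCKLIST: Set[str] = {"spamword"}


def looks_malicious(text: str) -> bool:
    """Hypothetical detector: flag input containing any blocklisted word."""
    return any(word in BLOCKLIST for word in text.lower().split())


@dataclass
class ToyModel:
    """Hypothetical stand-in for a trained model: it remembers seen words."""
    vocabulary: Set[str] = field(default_factory=set)

    def train(self, samples: List[str]) -> None:
        # "Training" here just adds every word of every sample.
        for sample in samples:
            self.vocabulary.update(sample.lower().split())

    def respond(self, query: str) -> str:
        # Answer with the query words the model has learned so far.
        known = [w for w in query.lower().split() if w in self.vocabulary]
        return f"known terms: {sorted(known)}"


def retrain_cycle(
    model: ToyModel,
    user_inputs: List[str],
    is_malicious: Callable[[str], bool] = looks_malicious,
) -> List[str]:
    """Remove flagged inputs, retrain on the remainder, return what was kept."""
    clean = [text for text in user_inputs if not is_malicious(text)]
    model.train(clean)
    return clean


if __name__ == "__main__":
    model = ToyModel()
    model.train(["hello world"])  # the previously trained, deployed model
    kept = retrain_cycle(model, ["good morning", "buy spamword now"])
    print(kept)
    print(model.respond("hello spamword morning"))
```

In this sketch the flagged input never reaches `train`, so it cannot influence later responses. With a real detector that misses some attacks, filtering would slow their effect on the model rather than prevent it outright, which matches the "delaying" wording of the summary.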