Sensitive data identification method and device, equipment and storage medium

Bibliographic details
Main authors: KONG WEIYU, YUAN KAIGUO, SHI MINGLEI, FU HAITAO, SI DAPENG, LU YIYUAN, SUN YANJIE
Format: Patent
Language: Chinese; English
Description
Summary: An embodiment of the invention provides a sensitive data identification method, device, equipment, and storage medium, applied in the technical field of machine learning. The method comprises: obtaining a sample set, in which each sample comprises a feature vector and a corresponding label, the label indicating whether the data to which the feature vector belongs is sensitive data; oversampling the sample set; and training a cost-sensitive model on the oversampled sample set, the trained cost-sensitive model serving as the sensitive data identification model. In this way, oversampling addresses the overfitting problem caused by sample imbalance, training the cost-sensitive model yields a sensitive data identification model with high recognition capability, and the model can then quickly and accurately identify whether data to be recognized is sensitive data, and the sens
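
The three steps named in the summary (labelled feature vectors, oversampling, cost-sensitive training) can be illustrated with a minimal sketch. The record does not say which oversampling technique or which model the patent uses, so everything specific here is an assumption made for illustration: random duplication of minority samples, a logistic regression whose loss is weighted per class, the hypothetical cost values `fn_cost` and `fp_cost`, the function names, and the toy two-dimensional data.

```python
import math
import random


def random_oversample(samples, seed=0):
    """Duplicate randomly chosen minority-class samples until both classes match in size.

    Assumption: plain random oversampling; the source does not name a technique.
    """
    rng = random.Random(seed)
    pos = [s for s in samples if s[1] == 1]
    neg = [s for s in samples if s[1] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        return list(samples)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra


def _sigmoid(z):
    # Clamp the logit so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def _logit(w, b, x):
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_cost_sensitive(samples, fn_cost=5.0, fp_cost=1.0, lr=0.1, epochs=500):
    """Fit a logistic regression with a class-weighted log loss by gradient descent.

    Errors on sensitive samples (label 1) are weighted by fn_cost and errors on
    non-sensitive samples (label 0) by fp_cost. Both costs are hypothetical.
    """
    dim = len(samples[0][0])
    w, b = [0.0] * dim, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in samples:
            cost = fn_cost if y == 1 else fp_cost
            err = cost * (_sigmoid(_logit(w, b, x)) - y)
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, x):
    """Return 1 (sensitive) or 0 (not sensitive) for feature vector x."""
    w, b = model
    return 1 if _sigmoid(_logit(w, b, x)) >= 0.5 else 0


# Toy, imbalanced sample set: 20 non-sensitive and 3 sensitive feature vectors.
samples = [([i * 0.05, (19 - i) * 0.05], 0) for i in range(20)]
samples += [([2.0, 2.1], 1), ([2.2, 1.9], 1), ([1.8, 2.0], 1)]

balanced = random_oversample(samples)      # step 2: oversample the sample set
model = train_cost_sensitive(balanced)     # step 3: train the cost-sensitive model
print(len(balanced), predict(model, [2.0, 2.0]), predict(model, [0.1, 0.1]))
```

Setting `fn_cost` above `fp_cost` encodes the assumption that missing a sensitive item is worse than flagging a harmless one; the appropriate ratio depends on the application and is not given in the record.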