Cleaning method for data classification and training databases based on mixed norm

Bibliographic details
Main authors:	GU YIYI, YIN YICHAO, QIU WENQIANG, YUAN YUBEI, ZHAO TINGTING, PU DONGMEI, GAO JU, RUAN TONG
Format:	Patent
Language:	chi ; eng
Subjects:	CALCULATING COMPUTING COUNTING ELECTRIC DIGITAL DATA PROCESSING PHYSICS

Description
Summary:	The invention discloses a mixed-norm-based cleaning method for data classification and training databases, with the purpose of greatly reducing the number of training samples and dimensions. The core of the method comprises the following steps: first, pre-processing the databases given by a user, where pre-processing comprises handling missing data and pre-cutting the data sets; second, using the mixed norm (including zero norm, a norm and infinity norm) and data-related technology to extract representative samples from the database; third, using orthogonalization technology to select optimal samples and, according to the classification labels, supplementing samples for groups whose sample count is zero. Based on this process, the method can greatly reduce the modeling time and memory time of data classification and so increase learning efficiency. Under the condition that cleaning efficiency indicators are given, we select normally-use
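
The building blocks named in the summary can be illustrated with a minimal Python sketch. It is not the patented procedure, whose details the record does not give. It assumes that "a norm" in the summary refers to the one norm, that missing values are filled with column means, and that an empty label group is supplemented with its first available sample. The function names `fill_missing`, `mixed_norms` and `complement_empty_groups` are hypothetical and chosen only for this illustration.

```python
def fill_missing(rows):
    """Replace None entries with the mean of the observed values in that column.

    Assumption: mean imputation stands in for the unspecified
    "handling missing data" step.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def mixed_norms(sample):
    """Return the (zero, one, infinity) norms of one sample vector."""
    zero_norm = sum(1 for v in sample if v != 0)           # number of non-zero entries
    one_norm = sum(abs(v) for v in sample)                 # sum of absolute values
    inf_norm = max((abs(v) for v in sample), default=0.0)  # largest absolute value
    return zero_norm, one_norm, inf_norm


def complement_empty_groups(selected, labels):
    """Append the first sample of every label that has no selected sample.

    Assumption: this is one plausible reading of supplementing groups
    whose sample count is zero.
    """
    covered = {labels[i] for i in selected}
    result = list(selected)
    for i, label in enumerate(labels):
        if label not in covered:
            result.append(i)
            covered.add(label)
    return result


# Toy data: three samples, one missing value, three class labels.
rows = [[1.0, None, 0.0], [0.0, 4.0, -3.0], [2.0, 2.0, 0.0]]
labels = ["a", "b", "c"]

clean = fill_missing(rows)
norms = [mixed_norms(r) for r in clean]
kept = complement_empty_groups([0], labels)
```

The sketch leaves out how the norms are combined to rank representative samples and how the orthogonalization step chooses among them, because the record does not describe either.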