Cleaning method for data classification and training databases based on mixed norm
Main authors: | , , , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Summary: | The invention discloses a cleaning method for data classification and training databases based on mixed norm, with the purpose of reducing the number of training samples and dimensions by a large margin. The core of the method comprises the following steps: first, pre-processing the databases given by a user, which includes handling missing data and pre-cutting data sets; second, using the mixed norm (including the zero norm, the one norm and the infinity norm) together with data-related techniques to extract representative samples from the database; third, using orthogonalization to select optimal samples and, according to the classification labels, complementing the groups whose sample count is zero. Based on the above process, the method can greatly reduce the modeling time and memory consumption of data classification and thereby increase learning efficiency. Under the condition that cleaning efficiency indicators are given, we select normally-use |
---|
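
The summary names three steps without giving formulas, so the following is only a minimal Python sketch of how such a pipeline could look, not the patented procedure. Everything in it is an assumption made for illustration: the function names (`impute_missing`, `mixed_norms`, `select_orthogonal`, `complete_labels`), mean imputation for missing data, and a greedy Gram-Schmidt selection standing in for the unspecified orthogonalization step. Only the definitions of the zero, one and infinity norms are standard.

```python
import math

def impute_missing(rows):
    """Replace None entries with the column mean (an assumed strategy)."""
    means = []
    for col in zip(*rows):
        present = [x for x in col if x is not None]
        means.append(sum(present) / len(present) if present else 0.0)
    return [[means[j] if x is None else x for j, x in enumerate(row)]
            for row in rows]

def mixed_norms(row):
    """Return (zero norm, one norm, infinity norm) of one sample."""
    zero = sum(1 for x in row if x != 0)          # number of non-zero entries
    one = sum(abs(x) for x in row)                # sum of absolute values
    inf = max((abs(x) for x in row), default=0)   # largest absolute value
    return zero, one, inf

def select_orthogonal(rows, k, tol=1e-9):
    """Greedy Gram-Schmidt stand-in: pick up to k samples, each time taking
    the one with the largest residual after projecting out earlier picks."""
    residuals = [[float(x) for x in r] for r in rows]
    chosen = []
    for _ in range(k):
        norms = [math.sqrt(sum(x * x for x in r)) for r in residuals]
        best = max(range(len(rows)), key=lambda i: norms[i])
        if norms[best] <= tol:
            break  # remaining samples are spanned by the chosen ones
        chosen.append(best)
        q = [x / norms[best] for x in residuals[best]]
        for i, r in enumerate(residuals):
            dot = sum(a * b for a, b in zip(r, q))
            residuals[i] = [a - dot * b for a, b in zip(r, q)]
    return chosen

def complete_labels(chosen, labels):
    """Add one sample for every label left with zero selected samples."""
    present = {labels[i] for i in chosen}
    result = list(chosen)
    for i, lab in enumerate(labels):
        if lab not in present:
            result.append(i)
            present.add(lab)
    return result
```

A possible use would be to call `impute_missing` on the raw rows, rank or filter samples by `mixed_norms`, pass the survivors to `select_orthogonal`, and finish with `complete_labels` so that no class ends up empty. How the patent actually combines the three norms is not stated in the summary.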