Improvement of Mutual Information based on TF-CA-CI algorithm

Bibliographic details
Main authors: Jiajia Chai, Dexian Zhang, Ruihuan Geng
Format: Conference proceedings
Language: English
Description
Summary: Mutual Information (MI) algorithms for text feature selection tend to favor rare terms. To address this limitation, this paper incorporates term frequency, the coupling factor among classes, and the degree of cohesion within a class into the MI algorithm, and proposes an improved MI approach based on the TF-CA-CI algorithm. Experimental results show that the improved method effectively controls the randomness that the MI method exhibits during feature selection when the dimension is low, and achieves better classification results. This demonstrates the effectiveness and feasibility of the improved method.
DOI:10.1109/GrC.2012.6468636
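
The rare-term bias mentioned in the summary can be illustrated with the textbook pointwise MI score, MI(t, c) = log(P(t, c) / (P(t) P(c))). The sketch below is only that standard score, not the paper's TF-CA-CI method, whose formulas are not given in this record. The function name `pointwise_mi` and all document counts are hypothetical, chosen for illustration.

```python
import math


def pointwise_mi(n_tc, n_t, n_c, n):
    """Textbook MI(t, c) = log(P(t, c) / (P(t) * P(c))), estimated from document counts.

    n_tc: documents in class c that contain term t
    n_t:  documents that contain term t
    n_c:  documents in class c
    n:    total number of documents
    """
    return math.log((n_tc / n) / ((n_t / n) * (n_c / n)))


# Hypothetical toy corpus: 1000 documents, 100 of them in class c.
N, N_C = 1000, 100

# Rare term: occurs in only 2 documents, both in class c.
rare = pointwise_mi(n_tc=2, n_t=2, n_c=N_C, n=N)

# Frequent term: occurs in 120 documents, 90 of them in class c.
frequent = pointwise_mi(n_tc=90, n_t=120, n_c=N_C, n=N)

print(f"rare term MI:     {rare:.4f}")
print(f"frequent term MI: {frequent:.4f}")
```

With these counts the rare term scores higher, although the frequent term covers 90 of the 100 documents in the class. A selector that ranks by this score alone would therefore prefer the rare term, which is the limitation the summary describes.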