Random forest classifier for multi-category classification of web pages
Main authors: | , |
---|---|
Format: | Conference proceedings |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | Web page classification is the automated assignment of predefined subject categories to documents. Automatic Web page classification is one of the most essential techniques for Web mining, given that the Web is a huge repository of varied information, including images and videos, and Web pages need to be categorized to satisfy user needs. Classifying Web pages into categories by hand relies exclusively on manpower, which costs much time and effort. To alleviate this manual classification problem, more researchers are focusing on Web page classification technology. In this paper, we propose a Random Forest (RF) classifier, based on the random forest method, for multi-category Web page classification. The proposed RF classifier can classify Web pages efficiently into their corresponding classes without using additional feature selection methods. We compared the accuracy of the proposed approach with that of a decision tree classifier on the same Yahoo Web pages. The experiments show that the proposed approach is suitable for multi-category Web page classification. |
DOI: | 10.1109/APSCC.2009.5394100 |
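
The summary gives no implementation details, so the following is only a rough illustration of the general random forest idea it names: each tree is trained on a bootstrap sample, each split considers a random subset of features, and the forest predicts by majority vote. The toy page texts, the category names, the word-count features, and every function name here are assumptions made for illustration; none of them comes from the paper or its Yahoo data set.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, n_feats, rng, depth=0, max_depth=8):
    """Grow one tree; a leaf is a label string, an inner node is a tuple."""
    if len(set(labels)) == 1 or depth >= max_depth:
        return Counter(labels).most_common(1)[0][0]
    best = None
    # Only a random subset of features is examined at each split.
    for f in rng.sample(range(len(rows[0])), n_feats):
        for t in set(r[f] for r in rows):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no usable split among the sampled features
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       n_feats, rng, depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       n_feats, rng, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    n_feats = max(1, int(len(rows[0]) ** 0.5))  # common sqrt heuristic
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_feats, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def vectorize(docs):
    """Bag-of-words counts; returns the rows and a function for new text."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    def to_row(doc):
        row = [0] * len(vocab)
        for w in doc.lower().split():
            if w in index:
                row[index[w]] += 1
        return row
    return [to_row(d) for d in docs], to_row

# Invented toy pages and categories (hypothetical, not the paper's data).
docs = ["football match score goal", "league team goal win",
        "stock market shares price", "bank interest price rate",
        "telescope galaxy star orbit", "physics experiment star particle"]
labels = ["sports", "sports", "finance", "finance", "science", "science"]

rows, to_row = vectorize(docs)
forest = train_forest(rows, labels)
print(predict_forest(forest, to_row("team scores late goal")))
```

A single decision tree, the baseline mentioned in the summary, corresponds here to calling `build_tree` once on all rows with `n_feats` set to the full feature count.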