Grindstone4Spam: An optimization toolkit for boosting e-mail classification


Bibliographic details
Published in: The Journal of systems and software, 2012-12, Vol. 85 (12), p. 2909-2920
Main authors: Méndez, José R., Reboiro-Jato, M., Díaz, Fernando, Díaz, Eduardo, Fdez-Riverola, Florentino
Format: Article
Language: English
Description
Abstract:
► Toolkit for improving the performance of content-based e-mail classifiers.
► Theoretical and practical issues in anti-spam filtering domain.
► Development, optimization and maintenance of anti-spam filters.
► Shortcomings of SpamAssassin framework for spam filtering.

As a result of the huge expansion of Internet usage, the problem of unsolicited commercial e-mail (UCE) has grown astronomically. Although a good number of successful content-based anti-spam filters are available, their widespread use in real scenarios is still a long way off. In this context, the SpamAssassin filter offers a rule-based framework that can easily be used as a powerful integration and deployment tool for the fast development of new anti-spam strategies. This paper presents Grindstone4Spam, a publicly available optimization toolkit for boosting SpamAssassin performance. Its applicability has been verified in two different case studies by comparing its results with those obtained by the default SpamAssassin software and by four well-known anti-spam filtering techniques: Naïve Bayes, Flexible Bayes, AdaBoost and Support Vector Machines. The proposed alternative clearly outperforms existing approaches in a cost-sensitive scenario.
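The rule-based, cost-sensitive setting mentioned in the abstract can be illustrated with a minimal sketch. It is not the Grindstone4Spam method or SpamAssassin's actual code: the rule names, patterns, scores and the false-positive cost `fp_cost` are hypothetical values chosen for illustration. The only detail taken from SpamAssassin itself is the general scheme of summing the scores of matching rules and comparing the total with a threshold (5.0 is assumed here, which matches SpamAssassin's documented default `required_score`).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str      # hypothetical rule name
    pattern: str   # regular expression matched against the lowercased text
    score: float   # weight added to the total when the rule fires


# Hypothetical rules and scores, for illustration only.
RULES = [
    Rule("HYP_FREE_MONEY", r"free money", 3.0),
    Rule("HYP_CLICK_HERE", r"click here", 2.5),
    Rule("HYP_MEETING", r"meeting agenda", -1.0),
]


def total_score(text, rules=RULES):
    """Sum the scores of every rule whose pattern occurs in the text."""
    lowered = text.lower()
    return sum(r.score for r in rules if re.search(r.pattern, lowered))


def is_spam(text, threshold=5.0, rules=RULES):
    """Flag a message as spam when its total score reaches the threshold."""
    return total_score(text, rules) >= threshold


def weighted_cost(labels, predictions, fp_cost=9.0):
    """Cost-sensitive error: a blocked legitimate message costs fp_cost,
    a missed spam message costs 1 (fp_cost is an assumed value)."""
    cost = 0.0
    for actual_spam, predicted_spam in zip(labels, predictions):
        if predicted_spam and not actual_spam:
            cost += fp_cost
        elif actual_spam and not predicted_spam:
            cost += 1.0
    return cost
```

In a sketch like this, an optimization step would adjust the rule scores or the threshold so that `weighted_cost` on a labeled corpus is as low as possible; how Grindstone4Spam actually does this is described in the paper, not here.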
ISSN: 0164-1212, 1873-1228
DOI: 10.1016/j.jss.2012.06.027